Introduction to Pig Latin - Programming Pig

Open Feedback Publishing System (OFPS) is now retired. Thank you to the authors and commenters who participated in the program. OFPS was an O'Reilly experiment that demonstrated the benefits of bridging the gap between private manuscripts and public blogs. Readers gained access to in-progress O'Reilly manuscripts and were able to communicate suggestions with the authors, follow others' comments, and directly participate in the development of new books. Additionally, authors published their in-progress work whenever they thought it ready for public comment and were able to update the site with new versions as the content was improved. Many of the book projects that were in OFPS have been migrated to the Atlas Reader.

http://www.oreilly.com/ofps/

Related:  Big Data

Top 10 data mining algorithms in plain English Today, I’m going to explain in plain English the top 10 most influential data mining algorithms as voted on by 3 separate panels in this survey paper. Once you know what they are, how they work, what they do and where you can find them, my hope is you’ll have this blog post as a springboard to learn even more about data mining. What are we waiting for? Let’s get started! Update 16-May-2015: Thanks to Yuval Merhav and Oliver Keyes for their suggestions which I’ve incorporated into the post. Update 28-May-2015: Thanks to Dan Steinberg (yes, the CART expert!)

Programming Android At over 500 pages this book covers a lot of material. The authors have extensive experience and provide good examples that you can download from the companion web site. If you want a book to work through to learn programming in Android that you can have confidence in, this could be the one. Even though the authors expect you to have some programming experience, the early chapters provide a good explanation of Object Theory and the Java Programming Language and a very good explanation of Threading. It is in Part 3 that we really start to get going with a skeleton application that you can use as a template for your own projects. Here the authors are very good at explaining the 'gotchas' that can catch you out e.g.

Adding Security and Membership This article explains how to secure an ASP.NET Web Pages (Razor) website so that some of the pages are available only to people who log in. (You'll also see how to create pages that anyone can access.) What you'll learn: How to create a website that has a registration page and a login page so that for some pages you can limit access to only members. How to create public and member-only pages. How to define roles, which are groups that have different security permissions on your site, and how to assign users to a role.

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison (Yes it's a long title, since people kept asking me to write about this and that too :) I do when it has a point.) While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it's just time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.) But, the differences between NoSQL databases are much bigger than ever was between one SQL database and another.

Commentpress CommentPress is an open source theme and plugin for the WordPress blogging engine that allows readers to comment paragraph-by-paragraph, line-by-line or block-by-block in the margins of a text. New in CommentPress 3.8: select some text and comment specifically on that selection. Annotate, gloss, workshop, debate: with CommentPress you can do all of these things on a finer-grained level, turning a document into a conversation. It can be applied to a fixed document (paper/essay/book etc.) or to a running blog. Use it in combination with multisite, BuddyPress and BuddyPress Groupblog to create communities around your documents.

Scheduling in Hadoop Hadoop is a general-purpose system that enables high-performance processing of data over a set of distributed nodes. But within this definition is the fact that Hadoop is a multi-tasking system that can process multiple data sets for multiple jobs for multiple users at the same time. This capability of multi-processing means that Hadoop has the opportunity to more optimally map jobs to resources in a way that optimizes their use. Up until 2008, Hadoop supported a single scheduler that was intermixed with the JobTracker logic. Although this implementation was perfect for the traditional batch jobs of Hadoop (such as log mining and Web indexing), the implementation was inflexible and could not be tailored.

Introduction to Electronics Theory Notes: Dr. Yatindra Nath Singh, Electrical Engineering Department, IIT Kanpur-208016. Lab material: Dr. Joseph John, Electrical Engineering Department, IIT Kanpur-208016. Reviewed by: Dr. S.P. Das, Electrical Engineering Department, IIT Kanpur-208016.

Using OAuthWebSecurity without SimpleMembership « brockallen I’ve been researching the new support in ASP.NET for OAuth and OpenID authentication. It provides a nice and easy to use wrapper on DotNetOpenAuth. The main APIs are on the OAuthWebSecurity class and they provide methods to authenticate against your OAuth and OpenID providers as well as associate those OAuth and OpenID accounts to an account with your local membership provider (and strictly speaking your simple membership provider).

Full text search in Rails with Sunspot and Solr « TechBot Full source code for this tutorial is available at GitHub.

Without Gods Retreat to My Study After a year of mostly daily blogging on this site, I am cutting back. As most of you know, I am writing a book on the history of disbelief for Carroll and Graf. The blog, produced while working on the book, was an experiment conceived by the Institute for the Future of the Book. It has been a success. I have been benefiting from informed and insightful comments by readers of the blog as I've tested some ideas from this book and explored some of their connections to contemporary debates.

Observations About Streaming Data Analytics for Science I recently had the pleasure of attending two excellent workshops on the topic of streaming data analytics and science. A goal of the workshops was to understand the state of the art of “big data” streaming applications in scientific research and, if possible, identify common themes and challenges. Called Stream2015 and Stream2016, these meetings were organized by Geoffrey Fox, Lavanya Ramakrishnan and Shantenu Jha. The talks at the workshop were from an excellent collection of scientists from universities and the national labs and professional software engineers who are building cloud-scale streaming data tools for the Internet industry. First it is important to understand what we mean by streaming data analytics and why it has become so important.

Serial Programming This book explains different aspects of serial data communication. Serial data communications is the foundation for most forms of data communications used with modern computing devices. The focus of the articles in this book will be around the implementation of RS-232 (aka RS-232C, aka V.24, aka EIA-232D, etc.) based serial data communication and will explore a wide range of implementations and uses for serial data transfer.

Extensions for ASP.NET MVC Demos Grid The Grid widget displays tabular data and offers rich support for interacting with data, including paging, sorting, grouping, and selection. Scheduler

Nuts & Bolts: Campfire loves Erlang. A couple of years ago a lot of buzz started in the Ruby community about Erlang, a functional programming language developed by Ericsson originally for use in telecommunications systems. I was intrigued by the talk of fault tolerance and concurrency, two of the cornerstones that Erlang was built on, so I ordered the Programming Erlang book written by Joe Armstrong and published by the Pragmatic Programmers and spent a couple of weeks working through it. A year later, Kevin Smith began producing his excellent Erlang in Practice screencast series in partnership with the Pragmatic Programmers. It’s amazing how much difference it made for me to be able to watch someone develop Erlang applications while talking through his thought process along the way.
