Apache Mahout: Scalable machine learning and data mining

http://mahout.apache.org//

Building a recommendation engine, foursquare style Mar 22nd Last summer, foursquare’s employee count had grown a bit beyond our office capacity (as we surged towards 20 employees) and we had people sitting in whatever open space we could find. We were split between floors, parked on folding tables, and crammed into couches and loveseats. In one of those seats, @anoopr was playing around with building a map showing interesting places, which we called “Explore.” In the ensuing eight months, we went through several iterations and evolutions to arrive at the recommendations engine we launched two weeks ago as part of the foursquare 3.0 update.

Weka 3 - Data Mining with Open Source Machine Learning Software in Java Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization.

Cordova About Apache Cordova™ Apache Cordova is a set of device APIs that allow a mobile app developer to access native device functions such as the camera or accelerometer from JavaScript. Combined with a UI framework such as jQuery Mobile, Dojo Mobile, or Sencha Touch, this allows a smartphone app to be developed with just HTML, CSS, and JavaScript.

Slope One Slope One is a family of algorithms used for collaborative filtering, introduced in a 2005 paper by Daniel Lemire and Anna Maclachlan.[1] Arguably, it is the simplest form of non-trivial item-based collaborative filtering based on ratings. Their simplicity makes them especially easy to implement efficiently, while their accuracy is often on par with more complicated and computationally expensive algorithms.[1][2] They have also been used as building blocks to improve other algorithms.[3][4][5][6][7][8][9] They are part of major open-source libraries such as Apache Mahout and Easyrec.
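
The Slope One snippet above names the idea but not the arithmetic, so here is a minimal sketch of the weighted variant in Python. It assumes ratings arrive as a nested dictionary `{user: {item: rating}}`; the function names `build_deviations` and `predict` are hypothetical and not taken from Mahout, Easyrec, or the original paper. For each item pair it stores the average rating difference and the number of users who rated both, then predicts a missing rating as a count-weighted average of "known rating plus deviation".

```python
from collections import defaultdict


def build_deviations(ratings):
    """Return (dev, freq) for ratings shaped like {user: {item: rating}}.

    dev[i][j]  = average of (rating_i - rating_j) over users who rated both
    freq[i][j] = number of users who rated both i and j
    """
    dev = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    dev[i][j] += r_i - r_j
                    freq[i][j] += 1
    # Turn the summed differences into averages.
    for i in dev:
        for j in dev[i]:
            dev[i][j] /= freq[i][j]
    return dev, freq


def predict(user_ratings, target, dev, freq):
    """Weighted Slope One prediction of `target` for one user, or None."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j == target:
            continue
        n = freq.get(target, {}).get(j, 0)
        if n:
            numerator += (dev[target][j] + r_j) * n
            denominator += n
    return numerator / denominator if denominator else None


# Illustrative data only (three users, three items).
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev, freq = build_deviations(ratings)
lucy_a = predict(ratings["lucy"], "A", dev, freq)  # 13/3, about 4.33
```

The pairwise table costs memory quadratic in the number of items, which is the usual trade-off for being able to update and query it cheaply.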

So, you want to build a recommendation engine? At PredictiveIntent, we had a lot of enquiries from people at companies who were not sure whether to build their own recommendation engine, plug in a lightweight recommendations solution, or dedicate some time to implementing “personalisation” properly. Our advice usually consists of three main points: Focus on your goals – will spending too much time building a recommendation engine take your development cycle off track?

COC131 Data Mining, Tutorials Weka "The overall goal of our project is to build a state-of-the-art facility for developing machine learning (ML) techniques and to apply them to real-world data mining problems. Our team has incorporated several standard ML techniques into a software "workbench" called WEKA, for Waikato Environment for Knowledge Analysis. With it, a specialist in a particular field is able to use ML to derive useful knowledge from databases that are far too large to be analysed by hand. WEKA's users are ML researchers and industrial scientists, but it is also widely used for teaching."

MySQL Engines: InnoDB vs. MyISAM – A Comparison of Pros and Cons The two major types of table storage engines for MySQL databases are InnoDB and MyISAM. To summarize the differences in features and performance: InnoDB is newer while MyISAM is older. InnoDB is more complex while MyISAM is simpler. InnoDB is stricter about data integrity while MyISAM is loose. InnoDB implements row-level locking for inserts and updates while MyISAM implements table-level locking. InnoDB has transactions while MyISAM does not. InnoDB has foreign keys and relationship constraints while MyISAM does not. InnoDB has better crash recovery while MyISAM is poor at recovering data integrity after system crashes. MyISAM has a full-text search index, which InnoDB lacked before MySQL 5.6.

Electronic Product Code The Electronic Product Code (EPC) is designed as a universal identifier that provides a unique identity for every physical object anywhere in the world, for all time. Its structure is defined in the EPCglobal Tag Data Standard [1], which is an open standard freely available for download from the website of EPCglobal, Inc. The canonical representation of an EPC is a URI, namely the 'pure-identity URI' representation that is intended for use when referring to a specific physical object in communications about EPCs among information systems and business application software. The EPCglobal Tag Data Standard also defines additional representations of an EPC identifier, such as the tag-encoding URI format and a compact binary format suitable for storing an EPC identifier efficiently within RFID tags (for which the low-cost passive RFID tags typically have limited memory capacity available for the EPC/UII memory bank). EPCs are not designed exclusively for use with RFID data carriers.
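
To make the pure-identity URI concrete, the following is a small Python sketch that splits one such URI into its parts. It assumes the SGTIN scheme with the layout `urn:epc:id:sgtin:CompanyPrefix.ItemReference.SerialNumber`; the helper name `parse_sgtin_uri` and the sample value are illustrative, and the sketch does not validate field lengths or handle other EPC schemes or percent-escaped characters.

```python
SGTIN_PREFIX = "urn:epc:id:sgtin:"


def parse_sgtin_uri(uri):
    """Split an SGTIN pure-identity URI into its three fields.

    Hypothetical helper: checks only the prefix and the field count.
    """
    if not uri.startswith(SGTIN_PREFIX):
        raise ValueError("not an SGTIN pure-identity URI: %r" % uri)
    # Split at most twice so a serial number containing '.' stays whole.
    parts = uri[len(SGTIN_PREFIX):].split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected CompanyPrefix.ItemReference.SerialNumber")
    company_prefix, item_reference, serial = parts
    if not (company_prefix.isdigit() and item_reference.isdigit()):
        raise ValueError("company prefix and item reference must be numeric")
    return {
        "company_prefix": company_prefix,
        "item_reference": item_reference,
        "serial": serial,
    }


epc = parse_sgtin_uri("urn:epc:id:sgtin:0614141.112345.400")
```

Keeping the fields as strings preserves leading zeros, which are significant in the company prefix.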

Doubly linked list A doubly-linked list whose nodes contain three fields: an integer value, the link to the next node, and the link to the previous node. The two node links allow traversal of the list in either direction. While adding or removing a node in a doubly-linked list requires changing more links than the same operations on a singly linked list, the operations are simpler and potentially more efficient (for nodes other than first nodes) because there is no need to keep track of the previous node during traversal or no need to traverse the list to find the previous node, so that its link can be modified.
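
The three-field node and the link updates described above can be sketched in Python as follows. The class and method names (`Node`, `DoublyLinkedList`, `append`, `remove`) are illustrative choices, not taken from the source article. Note that `remove` needs only the node itself: its `prev` link gives the predecessor directly, so no traversal is required.

```python
class Node:
    """A node with three fields: value, next link, previous link."""

    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        """Add a node at the tail and return it."""
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def remove(self, node):
        """Unlink `node` in constant time using its own prev/next links."""
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def forward(self):
        """Yield values from head to tail."""
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

    def backward(self):
        """Yield values from tail to head."""
        node = self.tail
        while node is not None:
            yield node.value
            node = node.prev


numbers = DoublyLinkedList()
first = numbers.append(1)
middle = numbers.append(2)
last = numbers.append(3)
numbers.remove(middle)
```

After the removal, `list(numbers.forward())` is `[1, 3]` and `list(numbers.backward())` is `[3, 1]`.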

Octave GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable.

Jetspeed 2 - Jetspeed 2 Home Page Jetspeed is an Open Portal Platform and Enterprise Information Portal, written entirely in open source under the Apache license in Java and XML and based on open standards. All access to the portal is managed through a robust portal security policy. Within a Jetspeed portal, individual portlets can be aggregated to create a page. Each portlet is an independent application with Jetspeed acting as the central hub making information from multiple sources available in an easy to use manner.
