
Resources for learning R. Big data sparks interest in statistical programming languages. Big data is driving the use of statistical programming languages, in particular the open source R language.

Big data sparks interest in statistical programming languages

This month's edition of the Tiobe index, which assesses language popularity based on data from search engines, has the R language ranked 15th, after being 12th last month and 31st a year ago. "Thanks to the big data hype, computational statistics is gaining attention nowadays," Tiobe says in its assessment. "Yes, R is gaining share for a while now," Tiobe Managing Director Paul Jansen said in an email.

"Please note that it is only 1.5 percent now, so it is still not 'a lot of share.' R is a language that is designed to process a lot of data and visualize the results in an easy way." Several other statistical programming languages also show up on the index, including Julia (number 126), LabView (63), Mathematica (80), MatLab (24), and S (84). The top five spots in the index were: C (17.47 percent), Java, Objective-C (9.06 percent), C++, and C# (4.99 percent).

Google Developers R Programming Video Lectures.

Speed up your R code using a just-in-time (JIT) compiler. This post is about speeding up your R code using the JIT (just-in-time) compilation capabilities offered by the new (well, now a year old) {compiler} package.

Speed up your R code using a just-in-time (JIT) compiler

Specifically, it deals with the practical difference between the enableJIT and cmpfun functions. If you do not want to read much, you can just skip to the example part. As always, I welcome any comments on this post, and hope to update it when future JIT solutions come along.

Prelude: what is JIT

Just-in-time compilation (JIT) is a method to improve the runtime performance of computer programs.

JIT in R

To date, there are two R packages that offer just-in-time compilation to R users: the {jit} package (through The Ra Extension to R), and the {compiler} package (which now comes bundled with any new R release, since R 2.13).
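As a rough illustration of the difference between cmpfun and enableJIT, here is a minimal sketch using the {compiler} package. The function slow_sum and the input size are made up for this example and are not taken from the original post. Note that recent R versions (3.4.0 and later) already enable the JIT by default, so any speed difference you observe may be small.

```r
library(compiler)

# Hypothetical example function: a deliberately loop-heavy running sum.
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i
  }
  total
}

# cmpfun() byte-compiles one specific function and returns the compiled copy.
fast_sum <- cmpfun(slow_sum)

# enableJIT() switches on automatic compilation for functions as they are used.
# It returns the previous JIT level, which we keep so we can restore it.
old_level <- enableJIT(3)
jit_result <- slow_sum(1000)
invisible(enableJIT(old_level))

# Both routes compute the same value; only the compilation strategy differs.
print(fast_sum(1000))
print(jit_result)
```

In short, cmpfun is an explicit, per-function choice, while enableJIT changes behaviour for the whole session. To compare timings on your own machine, wrap the calls in system.time().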

The {jit} package

The {compiler} package

The {compiler} package, created by Luke Tierney, offers a byte-code compiler for R:

Using the {compiler} package as a JIT for R

Description

Example.

Taking R to the Limit (High Performance Computing in R), Part 1.

rCharts. Recently, I blogged about two R packages, rCharts and rNVD3, that provide R users with a lattice-like interface for creating interactive visualizations using popular JavaScript libraries.


There was a lot of repeated code between the two packages, which led me to think that it might be possible to integrate multiple JS libraries into a single package with a common lattice-like interface. After heavy refactoring, I finally managed to implement three popular JS libraries in rCharts: Polycharts, NVD3 and MorrisJS. rCharts uses reference classes, which I believe is one of the best things to happen to R.
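For readers who have not met reference classes, the following is a minimal sketch of the general idea: a mutable object that holds a chart configuration as a list and updates it in place. The class name Chart, its fields, and its methods are hypothetical and are not rCharts' actual code.

```r
library(methods)

# Hypothetical sketch only: this is not the rCharts implementation.
Chart <- setRefClass(
  "Chart",
  fields = list(
    lib = "character",  # name of the JS library this chart targets
    params = "list"     # configuration that would be handed to that library
  ),
  methods = list(
    set = function(...) {
      # Merge new settings into the configuration, modifying the object in place.
      params <<- modifyList(params, list(...))
      invisible(.self)
    },
    config = function() {
      # Return the current configuration list.
      params
    }
  )
)

p <- Chart$new(lib = "polycharts", params = list(type = "point"))
p$set(x = "wt", y = "mpg")
str(p$config())
```

Because the object is modified in place, shared behaviour such as merging settings can live in one class and be reused for each JS library, which is the kind of code sharing the post describes.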

Reference classes allowed me to keep the code base pretty concise, while implementing a fair degree of functionality. The current structure of rCharts should make it easy to integrate any JS visualization library that uses a configuration variable to create charts. A huge advantage of wrapping these libraries within the same package is that they can share the common code. Example 1: Polycharts.

Package samplingVarEst.

Taking R to the Limit (High Performance Computing in R), Part 2.

Statistics.org.il/wp-content/uploads/2010/04/Big_Memory V0.pdf.

An Interactive Introduction To R (Programming Language For Statistics)

Rserve-php - Rserve client php library.

If you are into large data and work a lot with package ff. One of the main reasons why I prefer it over other packages that allow working with large datasets is that it is a complete set of tools.

If you are into large data and work a lot with package ff

If you disagree, do comment. Next to that there are some extra goodies allowing faster grouping by, not restricted to the ff package alone (fast groupwise aggregations: bySum, byMean, binned_sum, binned_sumsq, binned_tabulate).

    > require(ffbase)
    > hhp <- read.table.ffdf(file="/home/jan/Work/RForgeBNOSAC/github/RBelgium_HeritageHealthPrize/Data/Claims.csv", FUN = "read.csv", na.strings = "")
    > class(hhp)
    [1] "ffdf"
    > dim(hhp)
    > str(hhp[1:10,])
    'data.frame': 10 obs. of 14 variables:
     $ MemberID   : int 42286978 97903248 2759427 73570559 11837054 45844561 99829076 54666321 60497718 72200595
     $ ProviderID : int 8013252 3316066 2997752 7053364 7557061 1963488 6721023 9932074 363858 6251259
     $ Vendor     : int 172193 726296 140343 240043 496247 4042 265273 35565 293107 791272
     $ PCP        : int 37796 5300 91972 70119 68968 55823 91972 27294 64913 49465
    > ## Some basic showoff.
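To make clear what a groupwise aggregation such as bySum or byMean computes, here is a small in-memory analogue using only base R. It does not use ff or ffbase, and the claims data frame below is made up for illustration; the point is only the shape of the result (one value per group).

```r
# Hypothetical toy data standing in for a claims table.
claims <- data.frame(
  member = c("a", "b", "a", "c", "b", "a"),
  amount = c(10, 20, 30, 40, 50, 60)
)

# Groupwise sum: rowsum() adds up the values within each group.
sums <- rowsum(claims$amount, group = claims$member)

# Groupwise mean: tapply() applies mean() within each group.
means <- tapply(claims$amount, claims$member, mean)

print(sums)
print(means)
```

On data that fits in memory this is all you need; the appeal of the ff-based versions is doing the same kind of computation on data that is held on disk.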

Mapping Public Opinion: A Tutorial « David B. Sparks. Posted by d sparks on July 18, 2012. At the upcoming 2012 summer meeting of the Society of Political Methodology, I will be presenting a poster on Isarithmic Maps of Public Opinion. Since last posting on the topic, I have made major improvements to the code and robustness of the modeling approach, and written a tutorial that illustrates the production of such maps.

This tutorial is in a very rough draft form, but I will post it here when it is finalized. (An earlier draft had some errors, and so I have taken it down.)

R Offerings. Oracle has adopted R as a language and environment to support Statisticians, Data Analysts, and Data Scientists in performing statistical data analysis and advanced analytics, as well as generating sophisticated graphics.

R Offerings

In addressing the enterprise and the need to analyze Big Data, Oracle provides R integration through four key technologies:

Why Oracle for Advanced Analytics?

If you're an enterprise company, chances are you have your data in an Oracle database. You chose Oracle for its global reputation for providing the best software products (and now engineered systems) to support your organization.

Oracle Database is known for stellar performance and scalability, and Oracle delivers world-class support. If your data is already in Oracle Database or moving in that direction, leverage the high performance computing environment of the database to analyze your data. Oracle wants you to be successful with advanced analytics.

Oracle's Strategy for Advanced Analytics

Customer Video Stories.

Data Viz (R news & tutorials). An unabashedly narcissistic data analysis of my own tweets.

Data Viz (R news & tutorials)

The… pie( table( whence.i.tweet )) qplot( whence ) + coord_polar() pie( log( table( whence )))+RColorBrewer ggplot (see below) plot( density( tweets.len )) qplot(... stat="density") + geom_density qplot(...stat="bin") + geom_text(...) tweeple tweep...

Bare-bones intro to Plotting options in R. If you’re using base::plot in R for the first time (for example if you do plot(pima) or plot(faithful) (use ??

Tumblr Likes. Look at just the first digit and the number of digits.
science: 32914, 11566, 4989, 3743, 968, 814, 673, 482, 286, 2811
black and white: 1694, 1167, 1108, 988, 919, 639, 596, 591, 580, 544
lol: 22627, 18100, 17688, 14374, 13459, 12045, 4711, 3779, 36...

Tumblr Likes

Introducing the Lowry Plot

Pareto plot party! A Pareto plot is an enhanced bar chart.

Visualising questionnaires. Last week I was shown the results of a workplace happiness questionnaire.

A big list of the things R can do. R is an incredibly comprehensive statistics package.

A big list of the things R can do

Even if you just look at the standard R distribution (the base and recommended packages), R can do pretty much everything you need for data manipulation, visualization, and statistical analysis. And for everything else, there are more than 5000 packages on CRAN and other repositories, and the big-data capabilities of Revolution R Enterprise. As a result, trying to make a list of everything R can do is a difficult task. But we've made an effort in this list of R Language Features, a new section on the Revolution Analytics website.

R APPLICATIONS and EXTENSIONS***
PROGRAMMING LANGUAGE FEATURES

The asterisks indicate features not part of the standard R distribution, as follows:
* Requires Revolution R Enterprise
** Requires Revolution R Enterprise for IBM Netezza
*** Requires additional open-source community packages from CRAN

Click on the links above for details of R's capabilities within each of these sections.

Step up your R capabilities with new tools for increased productivity « Stats raving mad. I guess a lot of us actually use many tools to accomplish various things in our everyday lives.

Step up your R capabilities with new tools for increased productivity « Stats raving mad

There is the (not that uncommon) case where you have to build something that others will use in their everyday business life to get insights, information and/or make decisions. The basic implementation scenario here would be to build an Excel workbook where you feed in the data and have an overview sheet, named Dashboard… If things are on your side you could set up a connection to a database (an existing one or one you will create for the data in discussion) and pull data from there. You can build powerful and visually elegant things using this approach.

A cool resource to generate tears of joy among colleagues is Chandoo.org. OK, we all love R. But what about interactive results? Unfortunately you will soon realize that building a highly interactive dashboard has limited added value for complex questions, like the ones that predictive analytics would bombard your inbox with.

R. Fun with the googleVis Package for R.

Writing Fast R Code - Part 1. R news & tutorials from the web.