
Data Import/Export

Data Wrangling. Workflow, Collaboration, Reproducibility. Packages. Text/Strings. Graphics. R Apps - Shiny, Plumber APIs. Functions. Learning R. Teaching R. Examples. RStudio Cloud. RStudio - Databricks.

R in the Windows Subsystem for Linux

R has been available for Windows since the very beginning, but if you have a Windows machine and want to use R within a Linux ecosystem, that's easy to do with the new Fall Creators Update (version 1709).

If you need access to the gcc toolchain for building R packages, or simply prefer the bash environment, it's easy to get things up and running. Once you have things set up, you can launch a bash shell and run R at the terminal like you would in any Linux system. And that's because this is a Linux system: the Windows Subsystem for Linux is a complete Linux distribution running within Windows.

This page provides the details on installing Linux on Windows, but here are the basic steps you need and how to get the latest version of R up and running within it. First, enable the Windows Subsystem for Linux option. Next, you'll need to install your preferred distribution of Linux from the Microsoft Store.

OpenCPU. Awesome-R.

Blockspring

In this tutorial we'll show you how to add an R function to Blockspring. Doing so is helpful if you need to:

Expose your function to the world. Need some R in your Rails app, Android app, Raspberry Pi, or even in Google Spreadsheets? You'll be able to run your function from any application and device by simply copy/pasting a single line of code.
Share your function with anyone.

Basic Excel R Toolkit. Hire-an-r-programmer. A list of R conferences and meetings.

Search for Key Words or Phrases in Documentation

Description: Search for key words or phrases in help pages, vignettes or task views, using the search engine at and view them in a web browser.

Usage:

RSiteSearch(string, restrict = c("functions", "vignettes", "views"), format = c("normal", "short"), sortby = c("score", "date:late", "date:early", "subject", "subject:descending", "from", "from:descending", "size", "size:descending"), matchesPerPage = 20)

Details: This function is designed to work with the search site at and depends on that site continuing to be made available (thanks to Jonathan Baron and the School of Arts and Sciences of the University of Pennsylvania).
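As a small illustration of the usage above, the call below searches the function help pages for a phrase. The search phrase is an arbitrary example, and wrapping the call in interactive() is an assumption made here so that a non-interactive script does not try to open a browser.

```r
# Hypothetical search phrase, chosen only for illustration.
phrase <- "logistic regression"

# RSiteSearch() opens the results in a web browser, so only call it
# in an interactive session. It invisibly returns the URL it passed on.
if (interactive()) {
  url <- RSiteSearch(phrase, restrict = "functions", matchesPerPage = 20)
  print(url)
}
```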

Unique partial matches will work for all arguments.

Value: (Invisibly) the complete URL passed to the browser, including the query string.

Making R Files Executable (under Windows)

Although it is reasonable that R scripts get opened in edit mode by default, it would be even nicer (once in a while) to run them with a simple double-click.

Well, here we go ... Choosing a new file extension name (.Rexec): First, we have to think about a new file extension name.

Control Structures Loops in R

As part of the Data Science tutorial series, my previous post covered basic data types in R.

I have kept the tutorial very simple so that beginners of R programming may take off immediately. Please find the online R editor at the end of the post so that you can execute the code on the page itself.

Descriptive Loops: the best thing I’ve changed about my code in years

Hopefully, one’s coding habits are constantly improving.

If you feel any doubt about yourself, I suggest looking back at something you wrote in 2011.

Wonders of foreach

Posted by Andrew B. Collier on 2013-08-25. Writing code from scratch to do parallel computations can be rather tricky. However, the packages providing parallel facilities in R make it remarkably easy. One such package is foreach. I am going to document my trail of discovery with foreach, which began some time ago, but has really come to fruition over the last few weeks.

Five ways to handle Big Data in R

Big data was one of the biggest topics at this year’s useR conference in Albacete and it is definitely one of today’s hottest buzzwords.

But what defines “Big Data”? And on the practical side: How can big data be tackled in R? What data is big? Hadley Wickham, one of the best known R developers, gave an interesting definition of Big Data on the conceptual level in his useR!

How to speed up R Code: an intro.

How-to go parallel in R – basics + tips

Today is a good day to start parallelizing your code.
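A minimal sketch of what going parallel can look like with the parallel package that ships with R. The function slow_square and the choice of two workers are assumptions made for illustration; they are not taken from the linked post.

```r
library(parallel)

# Hypothetical stand-in for an expensive computation.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# Start two worker processes (an arbitrary choice for this sketch).
cl <- makeCluster(2)

# parLapply() splits the input across the workers and returns a list,
# in the same order as the input.
result <- parLapply(cl, 1:8, slow_square)

# Always release the workers when done.
stopCluster(cl)

squares <- unlist(result)
print(squares)
```

The socket cluster created by makeCluster() works on Windows as well as on Linux and macOS, which is why it is used here instead of the fork-based mclapply().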

I’ve been using the parallel package since its integration with R (v. 2.14.0) and it’s much easier than it at first seems.

Speed Up Your Code: Parallel Processing with multidplyr

There’s nothing more frustrating than waiting for long-running R scripts to iteratively run.

I’ve recently come across a new-ish package for parallel processing that plays nicely with the tidyverse: multidplyr. The package has saved me countless hours when applied to long-running, iterative scripts.

Future.apply - Parallelize Any Base R Apply Function. Caching Via Background R Processes. Rstudio Jobs: training models in parallel. Using `source()` while executing a local job.

Using memoise to cache R values

The memoise package can be very handy for caching the results of slow calculations. In interactive work, the slowest calculations can be reading data, so that is demonstrated here. The microbenchmark package shows timing results. Setup: First, load the package being tested, and also a benchmarking package.
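The idea behind caching slow calculations can be sketched in base R without any package. This is a hand-rolled illustration, not the memoise package's implementation; memoise_sketch, slow_square and the call counter are hypothetical names introduced here.

```r
# Wrap a one-argument function so that repeated calls with the same
# argument reuse the stored result instead of recomputing it.
memoise_sketch <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    # Turn the argument into a string so it can serve as a cache key.
    key <- paste(deparse(x), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Hypothetical slow function that counts how often it really runs.
calls <- 0
slow_square <- function(x) {
  calls <<- calls + 1
  Sys.sleep(0.05)
  x^2
}

fast_square <- memoise_sketch(slow_square)
first <- fast_square(4)   # computed by slow_square()
second <- fast_square(4)  # served from the cache
```

After both calls, calls is still 1, because the second result never reached slow_square().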

Optimize your R Code using Memoization. Growing vectors in a loop.

How to select a seed for simulation or randomization

If you need to generate a randomization list for a clinical trial, do some simulations or perhaps perform a huge bootstrap analysis, you need a way to draw random numbers. Putting many pieces of paper in a hat and drawing them is possible in theory, but you will probably be using a computer for doing this. The computer, however, does not generate random numbers. It generates pseudo-random numbers. They look and feel almost like real random numbers, but they are not random. Each number in the sequence is calculated from its predecessor, so the sequence has to begin somewhere; it begins with the seed – the first number in the sequence. Knowing the seed is a good idea. Using the same seed every time is not a good idea. How do you select the seed? The best practice is to choose a random seed, but this creates a magic circle.
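One way out of that circle is to take the seed from the system clock and write it down. The sketch below assumes that clock time is an acceptable source for a seed and that the recorded value is stored with the results; the variable names are illustrative.

```r
# Derive an integer seed from the current system time.
seed <- as.integer(Sys.time())

# Record the seed so the run can be reproduced later.
message("Using seed: ", seed)

set.seed(seed)
first_draw <- runif(3)

# Re-seeding with the recorded value reproduces the same draws.
set.seed(seed)
second_draw <- runif(3)

identical(first_draw, second_draw)
```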

This is how you do it in R by using the Sys.time() function: get the system time, convert it to an integer, and you’re done.

Progress bar in R

I spend a decent percentage of my working time in R looping over chromosomes, transcription factors or tissues, usually using parallelization. To get the stuff to run simultaneously I use the foreach function from the doMC package, and to monitor the progress of the execution I used the cat/print functions, which mostly just clutter the terminal. Today a friend of mine sent me a link to this blog, and I was dumbfounded when I read that the R base package has an inbuilt function for a nice-looking progress bar.
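For reference, a minimal sequential sketch of that built-in progress bar, using txtProgressBar() and setTxtProgressBar() from the utils package that is attached by default. The loop body and the number of iterations are placeholders.

```r
total <- 20

# style = 3 draws a bar with a percentage next to it.
pb <- txtProgressBar(min = 0, max = total, style = 3)

results <- numeric(total)
for (i in seq_len(total)) {
  Sys.sleep(0.01)           # placeholder for real work
  results[i] <- i^2
  setTxtProgressBar(pb, i)  # advance the bar to step i
}

# Finish the bar and move to a new line in the terminal.
close(pb)
```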

To my luck, it works perfectly when put inside a foreach loop.

library(doMC)
registerDoMC(5)
total <- 200
# create progress bar
pb <- txtProgressBar(min = 0, max = total, style = 3)
foreach(i = 1:total) %dopar% {
  Sys.sleep(0.1)
  setTxtProgressBar(pb, i)
}
close(pb)

Progressr

The progressr package provides a minimal API for reporting progress updates in R. The design is to separate the representation of progress updates from how they are presented. What type of progress to signal is controlled by the developer. How these progress updates are rendered is controlled by the end user.

Progressr slides from e-Rum 2020

I presented Progressr: An Inclusive, Unifying API for Progress Updates (15 minutes; 20 slides) at e-Rum 2020, on June 17, 2020:

HTML (incremental Google Slides; requires online access)
PDF (flat slides)
Abstract
Video - to be posted by the organizers

I am grateful to everyone involved who made e-Rum 2020 possible.

Notifications from R. Shutdown Windows after Script Has Finished. Benchmarkme. Bench: Timing hash functions. High Precision Timing of R Expressions.

List of useful RStudio addins made by useRs. RStudio addins manager. RStudio add-in to schedule R scripts. A Gadget for tidyr. RStudio addin for selecting colours, and another for adding marginal density plots to ggplot2. Citr addin. Remedy: RStudio Addins to Simplify Markdown Writing.

Styler - A non-invasive source code formatter for R

I am pleased to announce that the R package styler, which I have worked on through Google Summer of Code 2017 with Kirill Müller and Yihui Xie, has reached a mature stage.

Regexplain: Rstudio addin to help you with your regexes. ViewPipeSteps: Create tabs of View() output for each chained pipe. Fryingpane: Serve datasets from a package inside the RStudio Connection Pane. AlignAssign: Align the assignment operators within a highlighted area.

RStudio addins part 1 - code reproducibility testing

This is the first post in the RStudio:addins series. The aim of the series is to walk the readers through creating an R package that will contain functionality for integrating useful addins into the RStudio IDE. At the end of this first article, your RStudio will be 1 useful addin richer.

Read-only scripts.

Sometimes you just want a project-less RStudio session

If you’ve ever been to an R workshop I gave, you probably heard me say “if the only thing you get out of this workshop is that RStudio projects are awesome and you should use them, this workshop was worth your time”. And I stand by this statement, they are awesome!

RStudio addins for network analysis

A new version of the snahelper package is now available on CRAN. If you do not know the package: so far, it included one RStudio addin that provided a GUI to analyze and visualize networks. Check out the introductory post for more details. This major update includes two more addins that further facilitate the work with network data in R. The package requires the newest versions of ggraph (2.0.0) and graphlayouts (0.5.0).

Replace in Files. Rsthemes: Full RStudio IDE and Syntax Themes. Hrbraddins. Imageclipr. DataEditR.