
Benco


Plotsuper

Eight Terminal Utilities Every OS X Command Line User Should Know · mitchchn.me. The OS X Terminal opens up a world of powerful UNIX utilities and scripts.

Eight Terminal Utilities Every OS X Command Line User Should Know · mitchchn.me

If you’re migrating from Linux, you’ll find many familiar commands work the way you expect. But power users often aren’t aware that OS X comes with a number of its own text-based utilities not found on any other operating system. Learning about these Mac-only programs can make you more productive on the command line and help you bridge the gap between UNIX and your Mac.

Update: Thanks to reader feedback, I’ve written about a few more commands in a follow-up post: (And eight hundred more).

1. open

open opens files, directories and applications.

$ open /Applications/Safari.app/

…will launch Safari as if you had double-clicked its icon in the Finder. If you point open at a file instead, it will try to load the file with its associated GUI application. Running open screenshot.png on an image will open that image in Preview. Running open on a directory will take you straight to that directory in a Finder window.

$ ls ~ | pbcopy

Intro to The data.table Package. Data Frames. R provides a helpful data structure called the “data frame” that gives the user an intuitive way to organize, view, and access data.

Intro to The data.table Package

Many of the functions that you would use to read in external files (e.g. read.csv) or connect to databases (RMySQL) will return a data frame structure by default. While there are other important data structures, such as the vector, list and matrix, the data frame winds up being at the heart of many operations, not the least of which is aggregation. Before we get into that, let me offer a very brief review of data frame concepts:

10 R packages I wish I knew about earlier. I started using R about 3 years ago.

10 R packages I wish I knew about earlier

It was slow going at first. R had trickier and less intuitive syntax than the languages I was used to, and it took a while to get accustomed to the nuances. It wasn't immediately clear to me that the power of the language was bound up with the community and the diverse packages available. R can be more prickly and obscure than other languages like Python or Java. The good news is that there are tons of packages which provide simple and familiar interfaces on top of base R.

sqldf

install.packages("sqldf")

One of the steepest parts of the R learning curve is the syntax.

Introducing xda: R package for exploratory data analysis.

This R package contains several tools to perform initial exploratory analysis on any input dataset.

Introducing xda: R package for exploratory data analysis

It includes custom functions for plotting the data as well as for performing different kinds of analyses, such as univariate, bivariate and multivariate investigation, which is the first step of any predictive modeling pipeline. This package can be used to get a good sense of any dataset before jumping on to building predictive models. You can install the package from GitHub. The functions currently included in the package are mentioned below:

Installation

To install the xda package, the devtools package needs to be installed first. Then, use the following commands to install xda:

Usage

For all examples below, the popular iris dataset has been used.

The package is under constant development and more functionality will be added soon.

Stock Price Prediction With Big Data and Machine Learning - Eugene Zhulenev. Apache Spark and Spark MLlib for building a price movement prediction model from order log data.

Stock Price Prediction With Big Data and Machine Learning - Eugene Zhulenev

The code for this application can be found on GitHub.

Synopsis

This post is based on the paper “Modeling high-frequency limit order book dynamics with support vector machines”. Roughly speaking, I’m implementing the ideas introduced in this paper in Scala with Spark and Spark MLlib. The authors use sampling; I’m going to use the full order log from NYSE (sample data is available from the NYSE FTP), just because I can easily do it with Spark. If you want to get a deep understanding of the problem and the proposed solution, you need to read the paper. Predictive modelling is the process by which a model is created or chosen to try to best predict the probability of an outcome.

Model Architecture

In the table, each row of the message book represents a trading event that could be either an order submission, an order cancellation, or an order execution.

Feature Extraction and Training Data Preparation

GitHub - ezhulenev/orderbook-dynamics: Modeling high-frequency limit order book dynamics with support vector machines.

Getting started with PostgreSQL in R. When dealing with large datasets that potentially exceed the memory of your machine, it is nice to have another possibility, such as your own server with an SQL/PostgreSQL database on it, where you can query the data in smaller, digestible chunks.

Getting started with PostgreSQL in R

For example, I was recently facing a financial dataset of 5 GB. Although 5 GB fits into my RAM, the data uses a lot of resources. One solution is to use an SQL-based database, where I can query data in smaller chunks, leaving resources for the computation. While MySQL is more widely used, PostgreSQL has the advantage of being open source and free for all uses. However, we still need to get a server. First, we need to install the necessary software. Now we can already access and use the database; for example, we can start the interface (pgAdmin III) that was automatically installed with PostgreSQL.

pg_ctl -D "C:\Program Files\PostgreSQL\9.4\data" start

As we can see, we only have one user (“postgres”).

cd C:/Program Files/PostgreSQL/9.4/bin
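The idea of querying the data in smaller chunks, described in the PostgreSQL excerpt above, can be sketched from the command line. This is only an illustration under stated assumptions, not the article's own method: the table name trades, the ordering column id, and the database name finance are hypothetical, and the commented usage line assumes the psql client that ships with PostgreSQL is on the PATH. The helper itself only builds a LIMIT/OFFSET query string.

```shell
#!/bin/sh
# Build a SQL query that fetches one chunk of rows from a table.
# Arguments: table name, chunk size, zero-based chunk index.
# The table and column names used here are hypothetical placeholders.
chunk_query() {
  table=$1
  size=$2
  index=$3
  offset=$((size * index))
  # ORDER BY keeps the chunk boundaries stable between calls
  printf 'SELECT * FROM %s ORDER BY id LIMIT %d OFFSET %d;\n' "$table" "$size" "$offset"
}

# Print the queries for the first three chunks of 100000 rows each
for i in 0 1 2; do
  chunk_query trades 100000 "$i"
done

# Hypothetical usage against a running server:
#   psql -U postgres -d finance -c "$(chunk_query trades 100000 0)"
```

Each call yields a query for one slice of the table, so only that slice has to be held in memory at a time. For very large tables, filtering on an indexed column instead of using OFFSET may scale better, but that is a separate design choice.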