Databases

SQL Server Schemas & R Tip. I ran into an issue the other day where I was trying to write a new table to a SQL Server database with a non-default schema. I ended up spending a bit of time debugging and researching, so I wanted to share for anyone else who runs into the issue. The DBI::Id function allows you to specify the schema when you are trying to write a table to a SQL Server database. DBI::dbWriteTable(con, DBI::Id(schema = "schema", table = "tablename"), df) But the code above will return a strange error: After some investigation I found a workaround to be able to write the table. For non-default schemas, a “_” needs to be in the table name for it to work. DBI::dbWriteTable(con, DBI::Id(schema = "schema", table = "tablename_"), df) This really isn’t ideal for naming conventions, so the T-SQL command sp_rename can be used to rename the table to what I originally wanted. DBI::dbWriteTable(con, DBI::Id(schema = "schema", table = "tablename_"), df) DBI::dbGetQuery(con, "USE database; EXEC sp_rename '[schema].

Connecting to databases using RODBC on shinyapps.io. Database Queries With R · R Views. By Nathan Stephens. There are many ways to query data with R. This post shows you three of the most common ways: using DBI, using dplyr syntax, and using R Notebooks. Background: Several recent package improvements make it easier for you to use databases with R. The query examples below demonstrate some of the capabilities of these R packages. DBI. RStudio also made recent improvements to its products so they work better with databases. RStudio IDE (v1.1). Using databases with R is a broad subject and there is more work to be done. Example: Query bank data in an Oracle database. In this example, we will query bank data in an Oracle database. library(DBI) library(dplyr) library(dbplyr) library(odbc) con <- dbConnect(odbc::odbc(), "Oracle DB") 1. You can query your data with DBI by using the dbGetQuery() function. 2. You can write your code in dplyr syntax, and dplyr will translate your code into SQL. 3.

Did you know that you can run SQL code in an R Notebook code chunk? Summary. A modern database interface for R (Revolutions) At the useR! Conference last month, Jim Hester gave a talk about two packages that provide a modern database interface for R. Those packages are the odbc package (developed by Jim and other members of the RStudio team), and the DBI package (developed by Kirill Müller with support from the R Consortium). To communicate with databases, a common protocol is ODBC.

ODBC is an open, cross-platform standard for interfacing with databases, whatever the database software and variant of the SQL language it implements. R has long had the RODBC package created by Brian Ripley, but the new odbc package provides an updated alternative. The odbc package is a from-the-ground-up implementation of an ODBC interface for R that provides native support for additional data types (including dates, timestamps, raw binary, and 64-bit integers) and parameterized queries.
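As a rough sketch of what a parameterized query looks like through the DBI interface: the example below assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database as a stand-in for an ODBC data source, so it runs without a database server. The table `accounts` and its columns are hypothetical. With odbc, the connection would presumably be opened with `dbConnect(odbc::odbc(), ...)` instead, and the placeholder syntax may differ between drivers.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for a real ODBC data source.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example table.
accounts <- data.frame(
  id      = 1:3,
  owner   = c("Ann", "Bob", "O'Brien"),
  balance = c(120.5, 80, 310.25),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "accounts", accounts)

# Parameterized query: the value 100 is passed separately from the SQL text
# and bound to the "?" placeholder by the driver.
rich <- dbGetQuery(
  con,
  "SELECT owner, balance FROM accounts WHERE balance > ? ORDER BY id",
  params = list(100)
)
print(rich)

dbDisconnect(con)
```

Because the value travels separately from the SQL text, the apostrophe in "O'Brien" needs no manual escaping.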

With the odbc package (and the DBI package, which provides low-level connectivity to the database), you create a database connection like this: Untitled. Recently I have started using dplyr for handling my data in R. It makes everything a lot smoother! My previous workflow – running an SQL query, storing the results as CSV, loading it in RStudio – is now history. With dplyr you can directly query data from many different databases in a very convenient way.

Unfortunately, Microsoft SQL Server is not directly supported, but by using the package RSQLServer it can be done like with any other database. In this blog post I’ll explain how I installed everything on my Windows 7 machine to access MSSQL Server with R, since it was not as straightforward as one might think. The package RSQLServer is not available on CRAN anymore, but it can be installed from the GitHub repo imanuelcostigan/RSQLServer. # install.packages('devtools') devtools::install_github('imanuelcostigan/RSQLServer') If this works you’re lucky and already have all the necessary things installed. Fixing the installation To resolve this: For the local one I got an error message. Asdfree: MonetDBLite because fast. MonetDBLite is a SQL database that runs inside the R environment for statistical computing and does not require the installation of any external software. MonetDBLite is based on free and open-source MonetDB, a product of the Centrum Wiskunde & Informatica.

MonetDBLite is similar in functionality to RSQLite, but typically completes queries much faster due to its columnar storage architecture and bulk query processing model. Since both of these embedded SQL options rely on the R DBI interface, the conversion of legacy RSQLite project syntax over to MonetDBLite code should be a cinch. MonetDBLite works seamlessly with the dplyr grammar of data manipulation.

For a detailed tutorial of how to work with database-backed dplyr commands, see the dplyr databases vignette. To reproduce this vignette using MonetDBLite rather than RSQLite, simply replace the functions ending in *_sqlite with their *_monetdb counterparts. Installation Speed Comparisons Painless Startup Versatile Data Importation. Database interfaces. There are many different databases. The most familiar are row-column SQL databases like MySQL, SQLite, or PostgreSQL. Another type of database is the key-value store, which as a concept is very simple: you save a value specified by a key, and you can retrieve a value by its key.
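To make the key-value idea concrete, here is a toy sketch in base R that uses an environment as the store. It only illustrates the concept (save a value under a key, fetch it back by key) and is not how any particular key-value database is implemented; the helper names `kv_set` and `kv_get` and the sample record are made up for this example.

```r
# A toy key-value store backed by an R environment.
kv <- new.env()

# Save a value under a key.
kv_set <- function(key, value) {
  assign(key, value, envir = kv)
  invisible(value)
}

# Retrieve a value by its key, or a default if the key is absent.
kv_get <- function(key, default = NULL) {
  if (exists(key, envir = kv, inherits = FALSE)) {
    get(key, envir = kv, inherits = FALSE)
  } else {
    default
  }
}

# Hypothetical record: values can be arbitrary R objects.
kv_set("sample_42", list(species = "Quercus alba", height_m = 18.3))

print(kv_get("sample_42")$species)
print(kv_get("missing_key", default = NA))
```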

One more type is the document database, which instead of storing rows and columns, stores blobs of text or even binary files. The key-value and document types fall under the NoSQL umbrella. What is the difference between SQL and NoSQL (key-value, document)? NoSQL is often interpreted as Not only SQL - meaning a database that is called a NoSQL database may contain some notion of row-column storage, but other details diverge from traditional SQL databases. If you aren't already using databases, why care about databases? Use case 1: Let's say you are producing a lot of data in your lab - millions of rows of data. rOpenSci has an increasing suite of database tools: If you're wondering what database to use: elastic. Text bashing in R for SQL | Bearded Analytics. Fairly often, a coworker who is strong in Excel but weak in writing code will come to me for help getting special details about customers in their datasets.

Sometimes the reason is to call, email, or snail mail a survey, other times to do some classification grouping on the customer. Whatever the reason, the coworker has a list of ID numbers and needs help getting something out of a SQL database. When it isn't as simple as just adding quotes and commas to the cells in Excel before copying all the ID's into the WHERE clause of a very basic SELECT statement, I often fall back to R and let it do the work of putting together the SELECT statement and querying the data.
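The idea of letting R assemble the SELECT statement might look roughly like the base-R sketch below. The ID values, the table name `customers`, the column name `customer_id`, and the chunk size are all hypothetical, and this is not the original post's code. Only purely numeric IDs are kept, which is what makes pasting them into the query text reasonably safe here.

```r
# Hypothetical IDs as they might arrive from a spreadsheet column.
ids <- c(" 1001", "1002 ", "1003", "1003", NA, "10-04")

# Strip whitespace, then keep only unique, purely numeric IDs.
clean <- gsub("\\s", "", ids)
clean <- unique(clean[grepl("^[0-9]+$", clean)])

# Build one SELECT statement for a vector of IDs.
build_query <- function(ids, table = "customers", column = "customer_id") {
  sprintf("SELECT * FROM %s WHERE %s IN (%s)",
          table, column, paste(ids, collapse = ", "))
}

# Split into chunks so that no single IN list grows too large
# (chunk size 2 is only for demonstration).
chunk_size <- 2
chunks  <- split(clean, ceiling(seq_along(clean) / chunk_size))
queries <- vapply(chunks, build_query, character(1))

print(unname(queries))
```

Each element of `queries` could then be sent to the database, for example with `RODBC::sqlQuery(channel, q)` on an open RODBC channel, and the results combined with `rbind`.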

Suppose that you're given an Excel file with 1.2 million ID's and there's some transformation that you need to do first. Obviously, you first read the file in using your protocol and package of choice. Since we're ultimately doing SQL, let's take advantage of the RODBC package's cool features. Notice the use of the \s switch. Introducing db.r. If you follow our blog and you've read 10 R packages I wish I knew about earlier, you might remember package #5, the database driver of your choice. I think one of the reasons why many people avoid the R database packages and instead steadily collect piles upon piles of CSVs is that the R database packages aren't really all that user friendly.

They can feel a little overwhelming--especially if you're the kind of person who's more interested in R for stats than for managing a database. So we've written a package to help make querying databases in R more fun! Think db.py ... but in R. We've gotten some great reception to db.py. There have been some really helpful contributions to the project as well. Warning: We've been informed that there's a bug when using db.r with certain versions of RStudio. Connecting to databases. But as useful as db.py is for Python, there isn't really a counterpart for R. We used the same sort of API and concepts from db.py in building db.r. Inspecting Schema. MongoDB – State of the R | joy of data. Naturally there are two reasons why you need to access MongoDB from R: (1) MongoDB is already used for whatever reason and you want to analyze the data stored therein; (2) you decide you want to store your data in MongoDB instead of using native R technology like data.table or data.frame. In-memory data storage like data.table is very fast, especially for numerical data, provided the data actually fits into your RAM – but even then MongoDB comes along with a bag of goodies making it a tempting choice for a number of use cases: flexible schema-less data structures, spatial and textual indexing, spatial queries, persistence of data, easily accessible from other languages and systems. In case you would like to learn more about MongoDB then I have good news for you – MongoDB Inc. provides a number of very well made online courses catering to various languages.

An overview you may find here. The good news is – there are two packages available for making R talk to MongoDB. And its result: Parameterized SQL queries | SmarterPoland. 5 Aug 2014. Mateusz Żółtak asked me to spread the word about his new R package for parameterized SQL queries. Below you can find a copy of the package vignette. If you work with SQL in R you may find it useful. The package RODBCext is an extension of the RODBC database connectivity package. It provides support for parameterized queries. It is assumed that you already know the RODBC package and the basics of SQL and ODBC. Parameterized queries (also known as prepared statements) are a technique of query execution which separates a query string from query parameter values. Their main advantages are avoiding SQL injections and speeding up query execution in some scenarios.

Both are discussed below. SQL injection is an attack against your code which uses SQL queries. Even data from trusted data sources (even SQL ones) can cause problems in SQL queries if you use improper programming techniques. Are you sure that your data came from a really trusted source? Example – an apostrophe sign in data. Example – simple SQL injection. Summary.
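The two problems named above (an apostrophe in the data, and a simple injection) can be illustrated with plain string handling in base R, without any database. This is a sketch, not the vignette's own example: the table `people`, the column `name`, and the helpers `naive_query` and `count_quotes` are hypothetical.

```r
# Naive approach: paste the value straight into the SQL text.
naive_query <- function(name) {
  paste0("SELECT * FROM people WHERE name = '", name, "'")
}

# Count single-quote characters in a string.
count_quotes <- function(s) {
  lengths(regmatches(s, gregexpr("'", s, fixed = TRUE)))
}

ok        <- naive_query("Smith")
broken    <- naive_query("O'Brien")       # apostrophe in the data
injection <- naive_query("x' OR '1'='1")  # crafted input

print(ok)
print(broken)     # unbalanced quotes: the database would reject this
print(injection)  # valid SQL whose WHERE clause is now always true

print(count_quotes(broken))

# Parameterized form: the query text never changes; values travel separately.
param_query <- "SELECT * FROM people WHERE name = ?"
param_data  <- data.frame(name = c("Smith", "O'Brien"), stringsAsFactors = FALSE)
```

With RODBCext, a query and data frame like `param_query` and `param_data` would be passed to `sqlExecute()` on an open RODBC channel, so the values never become part of the SQL string.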