Deploying an R Application

This guide adapts the Deploying an Application article to the R programming language. Read the general Application Deployment article first, since this guide only skims topics that are covered in more detail there. The example is derived from Using R for Time Series Analysis.

You can find the sample app in our public source code repository (basic-examples/deployapp/skirts).

Sample R App

Our R application will be called skirts. It will contain a single R script called skirts_smooth.R. When run, the script downloads a dataset of women's skirt hem sizes from 1866 to 1911, runs a smoothing algorithm on the dataset, and then outputs a plot.

Contents of skirts_smooth.R:

# Save output as pdf
pdf('charts.pdf')

# Download skirts.dat from the web
skirts <- scan("http://robjhyndman.com/tsdldata/roberts/skirts.dat",skip=5)

# Create timeseries
skirtsseries <- ts(skirts,start=c(1866))

# Plot raw data
plot.ts(skirtsseries)

# Smooth using Holt's exponential smoothing
skirtsseriesforecasts <- HoltWinters(skirtsseries, gamma=FALSE)
skirtsseriesforecasts
skirtsseriesforecasts$SSE

# Plot smoothed data
plot(skirtsseriesforecasts)

You can run skirts_smooth.R on your machine with the following command:

$ R --no-save < skirts_smooth.R

Redirecting the script into R's standard input makes R run it instead of starting an interactive session. --no-save tells R not to save the workspace when the script finishes.

Upon completion, the script’s result is a PDF file called charts.pdf. It contains two graphs: the original plot of hem sizes over the years, and a smoothed version.
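
To inspect the fit itself rather than only the PDF, the following minimal sketch repeats the script's smoothing steps on a synthetic series, so it runs without network access. The values in hem are made up for illustration and are not the real hem-size data; only the ts and HoltWinters calls come from the script above.

```r
# Synthetic stand-in for the hem-size series (made-up values, NOT the real data)
set.seed(42)
hem <- 600 + cumsum(rnorm(46, mean = 2, sd = 10))

# Same steps as skirts_smooth.R: build a yearly series starting in 1866 ...
hemseries <- ts(hem, start = c(1866))

# ... and fit Holt's exponential smoothing (level and trend, no seasonality)
fit <- HoltWinters(hemseries, gamma = FALSE)

# alpha and beta are estimated by minimising the sum of squared errors (SSE)
cat("alpha:", fit$alpha, "beta:", fit$beta, "SSE:", fit$SSE, "\n")

# Fitted values (xhat, level, trend); the first two years seed the start values,
# so the fitted series begins in 1868
print(window(fit$fitted, end = 1870))
```

The same fields (fit$alpha, fit$beta, fit$SSE) are available on skirtsseriesforecasts in the real script.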

Creating an App

Create a directory called skirts containing skirts_smooth.R:

skirts/
skirts/skirts_smooth.R

To deploy on PiCloud, we first create a Volume for the skirts application.

$ picloud volume create skirts skirts

Now from within the skirts directory, run the following command to synchronize the directory with your volume.

$ picloud volume sync * skirts:

Running your App

To run skirts on the cloud, we’ll use picloud exec:

$ picloud exec -v skirts -w skirts -r charts.pdf 'R --no-save < skirts_smooth.R'
[jid]

-v says to use the skirts volume. -w says that the job’s working directory should be /home/picloud/skirts. -r says that the file charts.pdf should be considered the result of the job.
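
How the three flags combine can be sketched with a small R helper that assembles the same command line. The function picloud_exec_command is hypothetical (it is not part of PiCloud or R) and uses only the -v, -w and -r flags described above; it builds the string and does not submit a job.

```r
# Hypothetical helper: assemble a `picloud exec` command line from its parts.
# Only the -v, -w and -r flags described in this guide are used.
picloud_exec_command <- function(volume, workdir, result, command) {
  paste("picloud exec",
        "-v", volume,    # volume to attach to the job
        "-w", workdir,   # directory the job runs in
        "-r", result,    # file returned as the job's result
        shQuote(command, type = "sh"))  # quote so the shell passes it as one argument
}

cmd <- picloud_exec_command("skirts", "skirts", "charts.pdf",
                            "R --no-save < skirts_smooth.R")
cat(cmd, "\n")
```

Passing cmd to system() would submit the job, assuming the picloud CLI is installed and configured on the machine running R.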

To get the result of the job, use picloud result and redirect the standard output to a file.

$ picloud result [jid] > charts.pdf

You can now view the charts.pdf that was generated on PiCloud.

Using Your Bucket for Input and Output

We can modify skirts_smooth.R to use your Bucket for input and output.

First, download skirts.dat from http://robjhyndman.com/tsdldata/roberts/skirts.dat and store it in your bucket:

$ wget http://robjhyndman.com/tsdldata/roberts/skirts.dat # Mac & Linux only
$ picloud bucket put skirts.dat skirts.dat

Now modify skirts_smooth.R as follows:

# Save output as pdf
pdf('charts.pdf')

# Download skirts.dat from bucket
system("picloud bucket get skirts.dat .")

# Read skirts.dat from the local copy
skirts <- scan("skirts.dat",skip=5)

# Create timeseries
skirtsseries <- ts(skirts,start=c(1866))

# Plot raw data
plot.ts(skirtsseries)

# Smooth using Holt's exponential smoothing
skirtsseriesforecasts <- HoltWinters(skirtsseries, gamma=FALSE)
skirtsseriesforecasts
skirtsseriesforecasts$SSE

# Plot smoothed data
plot(skirtsseriesforecasts)
dev.off()  # Close the PDF device so charts.pdf is fully written
# Save output chart to bucket
system("picloud bucket put charts.pdf charts.pdf")

As you can see, R's system function lets the script call the picloud Command-Line Interface. Lines 4-8 download skirts.dat from your bucket with picloud bucket get, and then read it into R with scan. Line 25 saves the result of the job to your bucket with picloud bucket put, under the key charts.pdf.
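
Because system returns the command's exit status rather than raising an error, a failed upload or download can go unnoticed. A minimal sketch of a wrapper that stops the script on a non-zero status follows; run_cli is a hypothetical name, and the picloud commands shown in the comments are the ones used in the script above.

```r
# Hypothetical wrapper: run a shell command and stop if it reports failure
run_cli <- function(command) {
  status <- system(command)  # with intern = FALSE, system() returns the exit status
  if (status != 0) {
    stop(sprintf("command failed with exit status %d: %s", status, command))
  }
  invisible(status)
}

# Intended use inside skirts_smooth.R (requires the picloud CLI):
#   run_cli("picloud bucket get skirts.dat .")
#   run_cli("picloud bucket put charts.pdf charts.pdf")

# Harmless demonstration with a standard Unix command
run_cli("true")
```

Since the script runs non-interactively, the stop call ends the job at the failing step instead of continuing with missing data.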

To download the result for viewing:

$ picloud bucket get charts.pdf .

Handling Larger Datasets

If you want to operate on a large dataset, we recommend using a Volume. A volume appears as a directory on the filesystem, so existing R code can read and write its files through ordinary paths without modification.
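
As a rough sketch of what that looks like, the function below reads the series from a directory path with the same scan call used earlier. read_skirts is a hypothetical helper, and the /home/picloud/skirts path in the comment is an assumption based on the directory used earlier in this guide; the demonstration writes a small made-up file to a temporary directory so that it runs anywhere.

```r
# Hypothetical helper: read the hem-size series from a directory such as a
# mounted volume. Mirrors the scan() and ts() calls in skirts_smooth.R.
read_skirts <- function(data_dir, file = "skirts.dat", skip = 5) {
  values <- scan(file.path(data_dir, file), skip = skip, quiet = TRUE)
  ts(values, start = c(1866))
}

# On PiCloud this might be read_skirts("/home/picloud/skirts") (assumed path).
# Demonstration with a temporary directory and made-up values:
demo_dir <- tempfile("volume")
dir.create(demo_dir)
writeLines(c(paste("header line", 1:5), "608", "617", "625"),
           file.path(demo_dir, "skirts.dat"))

series <- read_skirts(demo_dir)
print(series)
```

With the data on a volume, the script needs no picloud bucket get call before reading the file.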