Environments enable you to use any library or program you need for your computation.
Fundamentally, libraries and programs are just files that a job needs to be able to find on the filesystem. By creating an environment, you modify the filesystem your job sees so that it contains the dependencies it needs.
Environments support programs and libraries compatible with Linux running on x86-64 architecture.
Think of an environment as a customized filesystem. When your job runs, it runs on top of a Linux filesystem. You can see this by running a job that lists the contents of the filesystem root.
For example, in Python:
>>> import cloud
>>> import os
>>> jid = cloud.call(lambda: os.listdir('/'))  # list contents of root dir
>>> cloud.result(jid)
['var', 'tmp', 'etc', 'usr', 'home', 'dev', 'bin', 'lib', 'lib64', 'mnt', 'run', 'proc', 'root', 'sbin', 'srv', 'sys']
In the shell:
$ picloud exec ls /
[jid]
$ picloud result [jid]
bin boot dev etc home lib lib64 media mnt proc root run sbin selinux srv sys tmp usr var
The contents of the filesystem seen above are called our base environment. It’s the default, hence base, filesystem available to you. The environment feature gives you the power to modify a base environment to include the programs and libraries your jobs need.
We provide three base environments. By default, picloud uses the Ubuntu 11.04 Natty base environment. When using cloud, the base environment used by default depends on the version of Python you’re using: Maverick for 2.6 and Natty for 2.7.
|Environment Name|Distribution|Python Version|Contents|
|---|---|---|---|
|/base/maverick|Ubuntu Maverick 10.10|2.6|Maverick Contents|
|/base/natty|Ubuntu Natty 11.04|2.7|Natty Contents|
|/base/precise|Ubuntu Precise 12.04|2.7|Precise Contents|
Click on a base environment above to see what’s installed.
In Python, to manually specify the base environment, use the _env keyword:
>>> cloud.call(f, _env='base/precise')
In the shell, use the -e flag:
$ picloud exec -e base/precise program
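If you prefer to pass `_env` explicitly rather than rely on the default, the version-to-base mapping described in this section can be restated in a small helper. This is only a sketch: `default_base_env` and `DEFAULT_BASE_BY_PYTHON` are hypothetical names, not part of the cloud library, and the mapping simply repeats the defaults stated above.

```python
import sys

# Defaults stated in this section: Python 2.6 -> Maverick, Python 2.7 -> Natty.
# The names follow the form used with the _env keyword above.
DEFAULT_BASE_BY_PYTHON = {
    (2, 6): "base/maverick",
    (2, 7): "base/natty",
}


def default_base_env(version_info=None):
    """Return the base environment name for a (major, minor, ...) version.

    Hypothetical helper: raises ValueError for versions this section
    does not mention instead of guessing.
    """
    if version_info is None:
        version_info = sys.version_info
    key = (version_info[0], version_info[1])
    try:
        return DEFAULT_BASE_BY_PYTHON[key]
    except KeyError:
        raise ValueError("no documented base environment for Python %d.%d" % key)
```

A call such as `cloud.call(f, _env=default_base_env((2, 7)))` would then name the Natty base explicitly.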
If the base environment you’re using does not contain what you need, you will need to create a custom environment. How you recognize that need depends on whether you’re using our Python library, cloud, or our CLI, picloud.
If you’ve been using Python-only packages with our cloud library, you’ve probably become accustomed to our Automagic Dependency Transfer. Code such as the following works straight out of the box without you needing to deploy your_expansive_library_of_functions manually to PiCloud.
>>> import cloud
>>> from your_expansive_library_of_functions import complex_function
>>> # cloud.call transfers all the modules needed to run complex_function on PiCloud
>>> cloud.call(complex_function)
However, the cloud library can only transfer pure Python modules. If you need Python modules that are not pure Python, such as C extensions, you’ll need to install them via an environment.
Here we’ll show an example using the ObsPy package, which is a Python toolbox for processing seismological data. The examples assume that obspy is installed locally.
>>> def f():
...     import obspy
...
>>> f()  # works because obspy is installed locally
>>> jid = cloud.call(f)
>>> cloud.result(jid)
[Mon Sep 19 13:32:29 2012] - [WARNING] - Cloud: Job 1337 threw exception: Could not depickle job
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cloud/serialization/cloudpickle.py", line 679, in subimport
    __import__(name)
ImportError: No module named obspy
As you can see, f() worked locally, but failed to run on PiCloud because obspy is not available on the base environment. In Create a new Environment, we’ll see how to resolve this.
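One rough way to anticipate this kind of failure before submitting a job is to check locally whether a module is a compiled extension rather than pure Python. The sketch below is a heuristic written for a current Python 3 interpreter (not the Python 2 versions used elsewhere on this page). The helper names are hypothetical and not part of the cloud library. It inspects only the named module, not its submodules or dependencies.

```python
import importlib.machinery
import importlib.util


def is_extension_path(path):
    """True if `path` ends with a compiled-extension suffix (e.g. .so, .pyd)."""
    return any(path.endswith(suffix)
               for suffix in importlib.machinery.EXTENSION_SUFFIXES)


def is_compiled_module(module_name):
    """Heuristic: True if the named module is a compiled extension.

    Raises ImportError if the module cannot be found locally. A False
    result is not a guarantee: submodules or dependencies of a package
    may still be compiled.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ImportError("module %r is not installed locally" % module_name)
    origin = spec.origin or ""   # namespace packages have no origin
    return is_extension_path(origin)
```

A package such as ObsPy may have a pure Python top-level `__init__.py` while still shipping compiled pieces, so treat a `False` answer as a hint only.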
For programs invoked through the CLI, deciding when you need an environment is simpler. Since picloud does not automatically copy executables from your machine to PiCloud, anything you need that is not in the base environment must be deployed with an environment.
In this example, we try to use the convert program that comes with ImageMagick.
$ picloud exec convert
[jid]
$ picloud result [jid]
Job [jid]: Traceback (most recent call last):
CloudException: command terminated with nonzero return code 127
$ picloud info [jid]
Info for jid [jid]
status: error
stderr: /bin/sh: convert: not found
The call to convert fails. When we examine the stderr, we can see that it was because convert was not found. Looking at the contents of the Ubuntu Natty 11.04 base environment, we shouldn’t be surprised since imagemagick is not included in the list.
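Return code 127 is the conventional shell exit status for "command not found", so the error above points to a missing program rather than a failure inside convert. The sketch below reproduces that convention locally with Python's `subprocess` module. It assumes a POSIX `/bin/sh` and uses a deliberately nonexistent command name chosen for illustration.

```python
import subprocess

# A command name chosen to be absent from any PATH (illustrative only).
MISSING = "no_such_program_for_demo_12345"

# Run the command through /bin/sh, the shell named in the job's stderr.
proc = subprocess.run(
    ["/bin/sh", "-c", MISSING],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

COMMAND_NOT_FOUND = 127  # POSIX shell exit status for "command not found"
missing_program = proc.returncode == COMMAND_NOT_FOUND
print(proc.returncode, proc.stderr.strip())
```

Seeing this status on a job is a good hint that the program needs to be installed through an environment.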
The simplest way to manage environments is through the Environments Dashboard.
Go to the Environments tab.
Click the “create new environment” button.
Choose the base environment most useful for you, keeping in mind that if you’re using cloud, you will want to pick a base environment with a compatible Python version. This is described in the Base Environment introductory section.
The Environment Name is the name you’ll use to reference the environment in your jobs. The Environment Description is for yourself and/or your team to keep track of the purpose and contents of each environment.
For our example, let’s name the environment sample_env.
When you click submit, your environment will appear under the “Environments being configured” section. You may have to wait a minute or two while we boot and configure a setup server for you. The setup server is a temporary machine that represents your environment. Changes you make to the setup server will be reflected in your environment when you save it.
When the setup server is ready, click the “connect” icon if you are using a web browser that supports websockets, and an SSH session will be started for you.
Otherwise, you can use an SSH client of your choice and follow the instructions provided by clicking on the “key” icon. Note that the instructions are tailored towards *nix systems. If you are using Windows and do not have an SSH client, we recommend Tunnelier.
Once you’ve SSH-ed in, you’ll find yourself in an Ubuntu Linux filesystem environment.
picloud@ip-10-46-223-4:~$ ls /
bin boot dev etc home lib lib64 media mnt opt proc root sbin selinux srv sys tmp usr var
Your current working directory is /home/picloud:
picloud@ip-10-46-223-4:~$ pwd
/home/picloud
You can verify the distribution of Ubuntu you’re using:
picloud@ip-10-46-223-4:~$ cat /etc/issue
Ubuntu 11.04 \n \l
We give you sudo access so that you have the freedom to install anything anywhere.
# this does not produce an error
picloud@ip-10-46-223-4:~$ sudo touch /root/i_can_be_root
The owner and group for files and directories in your environment do not matter. While you’ll be using the picloud and root user accounts, your jobs will be run with a different user account that will have access to the entire filesystem environment.
For the Python example, we’ll use sudo access to install the ObsPy library using pip.
picloud@ip-10-46-223-4:~$ sudo pip install obspy.core obspy.signal
Downloading/unpacking obspy.core
  Downloading obspy.core-0.4.8.zip (186Kb): 186Kb downloaded
  Running setup.py egg_info for package obspy.core
...  # output shortened for brevity
Successfully installed obspy.core obspy.signal
Cleaning up...
For the shell example, we’ll use sudo access to install imagemagick using apt-get.
picloud@ip-10-46-223-4:~$ sudo apt-get install imagemagick
Reading package lists... Done
Building dependency tree
Reading state information... Done
...
Setting up netpbm (2:10.0-15) ...
Setting up gs-cjk-resource (1.20100103-3) ...
Setting up libgs9 (9.05~dfsg-0ubuntu4.2) ...
Setting up ghostscript (9.05~dfsg-0ubuntu4.2) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
When you click the “save” icon, your SSH connection to the setup server will be closed. The length of time it takes to save your environment depends on how much you’ve installed. Once saving has completed, your jobs can start using the environment, and you can also SSH back into the setup server to make additional modifications. If you are finished making changes to your environment, or wish to discard the changes you’ve made since the last “save” request, simply click the “shutdown” icon. You can also perform both the save and shutdown operations by clicking the “save & shutdown” icon.
Please shut down the setup server when you aren’t using it. It costs us money to keep it running for you, and we automatically terminate it after 8 hours.
You may need to modify an existing environment in order to fix mistakes, install additional dependencies, or update packages you’ve already installed. Locate your environment in the “Your environments” section of the Environments Dashboard, and click on the “modify” icon. A setup server for the environment will be prepared for you.
If you feel more comfortable using the command line, or if you wish to automate the management of environments through scripts, the Environments Dashboard functionality is also available through picloud.
To create an environment called sample_env using the Ubuntu Precise 12.04 base:
$ picloud env list-bases
name      distro                 python_version
maverick  Ubuntu Maverick 10.10  2.6
natty     Ubuntu Natty 11.04     2.7
precise   Ubuntu Precise 12.04   2.7
$ picloud env create sample_env precise -d 'Initial environment for testing'
ec2-50-16-29-225.compute-1.amazonaws.com
Provide the create command with the environment name and the name of the base environment you wish to use. Pass the -d flag to set a description for the environment. The create command returns the hostname of the setup server where the sample_env environment can be modified.
You can connect to the setup server by invoking:
$ picloud env ssh sample_env
Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-38-virtual x86_64)
Welcome to your Environment Setup Server!
picloud@ip-10-12-27-237:~$ sudo apt-get install imagemagick
...
You can also issue shell commands to be run on the setup server.
$ picloud env ssh sample_env pwd
/home/picloud
$ picloud env ssh sample_env cat /etc/issue
Ubuntu 12.04.1 LTS \n \l
If you are using environments to compile and install your own programs, you will need to transfer files between your local machine and the setup server. You can use env rsync for this purpose.
$ ls my_dir
file1 file2
$ picloud env rsync my_dir sample_env:/home/picloud/
sending incremental file list
my_dir/
my_dir/file1
my_dir/file2
sent 152 bytes received 54 bytes 82.40 bytes/sec
total size is 0 speedup is 0.00
$ picloud env ssh sample_env ls -R
.:
my_dir
./my_dir:
file1
file2
Note that the syntax for env rsync is modeled after that of the real rsync program, except the environment name is used in place of username@hostname. You can also pull files from the setup server to your local machine by specifying the environment path (e.g. “sample_env:/home/picloud/my_dir”) as the source.
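When transfers are driven from a script, it can help to build the source and destination arguments programmatically. The following is a minimal sketch: `env_path` and `rsync_command` are hypothetical helper names, and the functions only assemble the argument list shown in the example above without running anything.

```python
def env_path(env_name, path):
    """Format a setup-server location the way env rsync expects: name:path."""
    return "%s:%s" % (env_name, path)


def rsync_command(source, destination):
    """Return the argv list for a `picloud env rsync` transfer (not executed)."""
    return ["picloud", "env", "rsync", source, destination]


# Push a local directory to the setup server (matches the example above).
push = rsync_command("my_dir", env_path("sample_env", "/home/picloud/"))

# Pull the same directory back into the current local directory.
pull = rsync_command(env_path("sample_env", "/home/picloud/my_dir"), ".")
```

Either list can be handed to `subprocess.check_call` on a machine where the picloud CLI is installed and configured.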
As you automate the process of creating or updating your custom environments, it is likely you will encapsulate the necessary sequence of commands into scripts. The env module offers a convenience wrapper for this purpose that copies a local script file to the setup server, executes it, and displays the output.
$ cat <<END > my_script
> #!/bin/bash
> sudo pip install obspy.core obspy.signal
> sudo apt-get install -y imagemagick  # -y ensures it'll run without interaction
> END
$ picloud env run-script sample_env my_script
Downloading/unpacking obspy.core  # start of output from pip install
  Downloading obspy.core-0.4.8.zip (186Kb): 186Kb downloaded
  Running setup.py egg_info for package obspy.core
...
Successfully installed obspy.core obspy.signal
Cleaning up...
Reading package lists... Done  # start of output from apt-get install
Building dependency tree
Reading state information... Done
...
Setting up netpbm (2:10.0-15) ...
Setting up gs-cjk-resource (1.20100103-3) ...
Setting up libgs9 (9.05~dfsg-0ubuntu4.2) ...
Setting up ghostscript (9.05~dfsg-0ubuntu4.2) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
You can save your environment from both the setup server and your local machine:
# This is on the setup server. It will also log you out of the setup server.
# Use "sudo save-shutdown" if setup server should be terminated after saving.
picloud@ip-10-12-27-237:~$ sudo save
Environment save has been initiated...
Connection to ec2-50-16-29-225.compute-1.amazonaws.com closed by remote host.
# This is on your local machine.
$ picloud env save sample_env
$ picloud env shutdown sample_env
# both operations can be done with one command:
# picloud env save-shutdown sample_env
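The create, run-script, and save steps from this section can be chained in a single script. The sketch below uses only the picloud subcommands shown on this page. The function names are hypothetical. It also assumes that `picloud env create` returns only once the setup server is ready, which this page states for `env modify` but not explicitly for `env create`. By default nothing is executed, so the command sequence can be inspected first.

```python
import subprocess


def provisioning_commands(env_name, base, script, description=None):
    """Return the picloud commands to create, configure, and save an environment."""
    create = ["picloud", "env", "create", env_name, base]
    if description:
        create += ["-d", description]
    return [
        create,
        ["picloud", "env", "run-script", env_name, script],
        ["picloud", "env", "save-shutdown", env_name],
    ]


def provision(env_name, base, script, description=None, dry_run=True):
    """Run the commands in order, stopping at the first failure.

    With dry_run=True (the default) nothing is executed; the commands
    are only returned so they can be inspected.
    """
    commands = provisioning_commands(env_name, base, script, description)
    if not dry_run:
        for argv in commands:
            subprocess.check_call(argv)  # raises CalledProcessError on failure
    return commands
```

For example, `provision("sample_env", "precise", "my_script", dry_run=False)` would run the three steps in sequence on a machine with the picloud CLI configured.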
To modify an existing environment, use env modify:
$ picloud env list
name        status  action  created              last_modified
sample_env  ready   idle    2013-04-18 07:17:39  2013-04-19 03:36:24
$ picloud env modify sample_env
ec2-54-234-109-217.compute-1.amazonaws.com
The env modify command returns once a setup server has been prepared for your environment modification.
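A script that manages several environments may want to read the `picloud env list` output before deciding what to modify. The parser below is a sketch that assumes the whitespace-separated layout shown in the sample output above (a header row, then name, status, action, and two "date time" timestamps per row). The function name is hypothetical, and the real output format may differ.

```python
def parse_env_list(output):
    """Parse `picloud env list` output into a list of dicts.

    Assumes the layout shown in the sample: a header row followed by
    rows of exactly seven whitespace-separated fields.
    """
    rows = []
    for line in output.splitlines()[1:]:      # skip the header row
        parts = line.split()
        if len(parts) != 7:
            continue                          # ignore blank or unexpected lines
        name, status, action, c_date, c_time, m_date, m_time = parts
        rows.append({
            "name": name,
            "status": status,
            "action": action,
            "created": c_date + " " + c_time,
            "last_modified": m_date + " " + m_time,
        })
    return rows


sample = """\
name        status  action  created              last_modified
sample_env  ready   idle    2013-04-18 07:17:39  2013-04-19 03:36:24
"""
envs = parse_env_list(sample)
```

The resulting dictionaries can then be filtered by name or status before calling `picloud env modify`.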
Using an environment with a job is simple. In Python, you use the _env keyword:
>>> def f():
...     import obspy
...     return obspy.__path__
...
>>> jid = cloud.call(f, _env='sample_env')
>>> cloud.result(jid)
['/usr/local/lib/python2.7/dist-packages/obspy']
Now that the job is told to use sample_env, it runs without error.
In the shell, use the -e flag:
$ picloud exec -e sample_env convert -version
[jid]
$ picloud result [jid]
Version: ImageMagick 6.6.9-7 2012-08-17 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2011 ImageMagick Studio LLC
Features: OpenMP
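When many Python jobs share one environment, repeating `_env='sample_env'` on every call is easy to forget. One option is to pre-fill the keyword with `functools.partial`. The sketch below is illustrative only: `bind_env` is a hypothetical helper, and `fake_call` is a stand-in for `cloud.call` so that the example runs without a PiCloud account.

```python
import functools


def bind_env(call, env_name):
    """Return `call` with the _env keyword pre-filled."""
    return functools.partial(call, _env=env_name)


# Stand-in for cloud.call so the sketch runs without a PiCloud account:
# it records what it was asked to do instead of submitting a job.
def fake_call(func, *args, **kwargs):
    return {"func": func.__name__, "args": args, "kwargs": kwargs}


call_in_sample_env = bind_env(fake_call, "sample_env")
submitted = call_in_sample_env(len, "abc")
```

With the real library, `bind_env(cloud.call, 'sample_env')` would be used in place of the stand-in.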