Client Basics

It should be quite easy to start using PiCloud. If any of the below does not work as you expect, please see the cloud pitfalls section.

Some quick terms:

  • PiCloud refers to the cloud platform solution operated by PiCloud, Inc.
  • Client refers to any computer communicating with PiCloud. This computer is typically an individual’s personal computer, but it may be a server in another sense (e.g. a webserver that interacts with PiCloud).
  • cloud is a Python module that allows the client to run arbitrary code on PiCloud's servers.
_images/basic_pi.PNG

System Requirements

Python 2.5, Python 2.6 or Python 3.1 with a correctly installed cloud module.

Note

Jobs sent to PiCloud when using a Python 2.x client will be run under Python 2.6. Jobs sent using a Python 3.x client will be run under Python 3.1.

Running a Job

cloud.call() allows the client to run functions on PiCloud. For instance, a function named add can be run on the PiCloud servers merely by invoking cloud.call(add). Arguments can be passed positionally by listing them after the function, so add(1,2) becomes cloud.call(add,1,2), or by keyword, so add(x=1, y=2) becomes cloud.call(add, x=1, y=2). After add completes, its return value is saved and can be accessed via cloud.result().

Upon invoking cloud.call(), the PiCloud server will create a job which runs the desired function. cloud.call() is non-blocking; its return value is an integer jid (Job IDentification). This jid can be used to access information about the job through the cloud module, as well as the PiCloud web interface. The below diagram represents this example:

_images/basic_pi_call.PNG

Because cloud.call() is non-blocking, all jobs run on PiCloud’s servers in parallel with each other and the client. As jobs run, the client can continue to create more jobs with cloud.call(). Consequently, cloud.call() makes coarse-grained parallelism easy to achieve.
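
The call-then-collect pattern described above can be tried locally with a small stand-in. The sketch below is not the cloud module: the call and result helpers here are hypothetical local imitations built on Python's concurrent.futures, meant only to show a non-blocking submit that hands back an integer jid, followed by a blocking fetch of the return value.

```python
# Local stand-in for the call/result pattern; this is NOT the cloud module.
from concurrent.futures import ThreadPoolExecutor
import itertools

_executor = ThreadPoolExecutor(max_workers=4)
_jobs = {}                      # maps jid -> Future
_next_jid = itertools.count(1)  # jids are plain integers


def call(func, *args, **kwargs):
    """Submit func and return an integer jid immediately (non-blocking)."""
    jid = next(_next_jid)
    _jobs[jid] = _executor.submit(func, *args, **kwargs)
    return jid


def result(jid):
    """Block until the job finishes, then return its return value."""
    return _jobs[jid].result()


def add(x, y):
    return x + y


jid_a = call(add, 1, 2)        # positional arguments
jid_b = call(add, x=1, y=2)    # keyword arguments
total_a = result(jid_a)
total_b = result(jid_b)
_executor.shutdown()
```

With the real module, the same shape would presumably read jid = cloud.call(add, 1, 2) followed by cloud.result(jid), as described in this section.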

Functions that run on the PiCloud servers can do potentially anything. They can even open up arbitrary connections to download documents, access databases, post data to websites, etc.

All relevant information needed to execute your function (code, global variables, class variables, etc.) is transmitted to the PiCloud server. Most users will find that any function they’ve written can be passed through cloud.call(). See the Limitations section for more information.

Consider using cloud.map if you are generating many jobs in parallel with the same function, but different arguments.

Warning

There is some overhead on cloud calls. For PiCloud to be worthwhile, your function should take at least a tenth of a second to execute. See the PiCloud examples section for proper design patterns.

Note

cloud.call() may be invoked within a function running on PiCloud, allowing jobs to generate new jobs.

Accessing Job Status

To access the status of the created job, you will need the jid returned by cloud.call(). Job meta-data is purged very infrequently by PiCloud, so the jid can be placed safely into a database and checked days later.

Use cloud.status(jid), where jid was returned by an earlier call to cloud.call(), to get a job’s status. Like every other job-related function, cloud.status() also accepts a sequence of jids; if given a sequence, it returns a list of statuses in the same order as the requested jids.

The possible statuses are:

Status      Meaning
queued      Job is in a queue on the server waiting to be run.
processing  Job is running.
waiting     Job is waiting until its dependencies are satisfied.
done        Job completed successfully.
error       Job errored (typically due to an uncaught exception).
killed      Job was aborted by the user.
stalled     Job will not run due to a dependency erroring.

Blocking until Job finishes

Use cloud.join(jid) to block until the job with jid jid has a “complete” status (done, error, killed, or stalled). If the job errors, a cloud.CloudException will be thrown with a traceback indicating what went wrong on the server.

cloud.join() will also accept a sequence of jids and will block until all jobs complete. If multiple jobs throw an exception, the exception thrown will describe the first (in sequence order) job that errored.

An optional timeout may be set. If the join takes longer than timeout seconds, it will abort by throwing a cloud.CloudTimeoutError.
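
To make the blocking and timeout behavior concrete, here is a rough local sketch of a join built by polling a status function. It is an assumption that polling resembles what happens internally; the join helper, the JoinTimeout exception, and the get_status callable are all hypothetical names, and the sketch leaves out the exception that the real cloud.join() raises when a job errors.

```python
import time

# The four "complete" statuses named in this section.
COMPLETE = {"done", "error", "killed", "stalled"}


class JoinTimeout(Exception):
    """Hypothetical stand-in for cloud.CloudTimeoutError."""


def join(get_status, jids, timeout=None, poll_interval=0.01):
    """Block until every jid reports a complete status, or raise JoinTimeout."""
    deadline = None if timeout is None else time.monotonic() + timeout
    pending = list(jids)
    while pending:
        # Keep only the jobs that have not yet reached a complete status.
        pending = [j for j in pending if get_status(j) not in COMPLETE]
        if not pending:
            break
        if deadline is not None and time.monotonic() >= deadline:
            raise JoinTimeout("jobs still running: %r" % (pending,))
        time.sleep(poll_interval)


# Fake status source: job 1 is finished, job 2 never finishes.
fake_statuses = {1: "done", 2: "processing"}

join(fake_statuses.get, [1])   # returns at once, job 1 is already done

try:
    join(fake_statuses.get, [1, 2], timeout=0.05)
    timed_out = False
except JoinTimeout:
    timed_out = True
```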

Accessing Return Value

Use the cloud.result() function to access the return value of the function that ran on PiCloud. This function blocks (with cloud.join()) until the job has completed. Like cloud.join(), any error will result in an exception being thrown. The diagram below represents the earlier described case of add:

_images/basic_pi_result.PNG

cloud.result() also accepts a sequence of jids; if given a sequence, a list of return values is returned corresponding to the requested jids.

As with join, a timeout may be set.

Note

As functions are allowed to open connections, it is acceptable to not have a return value. For instance, your function might read from your database, perform some heavy computation, and then write back to your database.

Mapping

cloud.map() mimics the built-in Python map function. A basic use of the built-in map function is:

added2 = map(lambda x: x+2, an_iterator)

This is equivalent to:

added2 = [x+2 for x in an_iterator]

In other words, newlist = map(func,sequence) will return a list where newlist[i] = func(sequence[i]).
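
A small runnable check of this equivalence, using a made-up input list. Note that under Python 3 the built-in map returns a lazy iterator rather than a list, so the sketch wraps it in list(); under Python 2 the wrapper is harmless.

```python
values = [1, 2, 3]  # made-up sample data

# Python 3: map() is lazy, so materialize it with list().
# Python 2: map() already returns a list.
added2_map = list(map(lambda x: x + 2, values))

# The list-comprehension form of the same computation.
added2_comp = [x + 2 for x in values]
```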

cloud.map() is designed for both ease of use and speed when applying the same function to a list of data. One job is created per element of an_iterator. Using cloud.map() instead of a for loop that issues many cloud.call() invocations minimizes client-to-PiCloud overhead.

The return value of cloud.map() is an ordered sequence of jids where jid[i] corresponds to func(sequence[i]). As mentioned earlier, this sequence can be passed to cloud.status(), cloud.join(), and cloud.result().

The below diagram shows what happens when cloud.map(square,[2,3]) is called:

_images/basic_pi_map.PNG

One can also make more complex map calls, such as:

products = cloud.map(lambda x,y: x*y, xlist, ylist)

cloud.iresult may come in handy if you wish to iterate through the results of cloud.map().

Note

If xlist and ylist are different lengths, cloud.map() will increase the argument lists to the maximum length of the passed lists, inserting None when needed.
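
This padding matches what the Python 2 built-in map does with sequences of unequal length. It can be reproduced locally with itertools.zip_longest, as in the sketch below. One consequence worth seeing: a function such as lambda x,y: x*y raises a TypeError when handed None, so the mapped function has to tolerate padding. The safe_product helper is a hypothetical name used only for illustration.

```python
from itertools import zip_longest

xlist = [1, 2, 3]
ylist = [10, 20]          # one element shorter than xlist

# zip_longest pads the shorter sequence with None, like Python 2's map.
padded = list(zip_longest(xlist, ylist))


def safe_product(x, y):
    """Return x*y, or None when either argument is padding."""
    if x is None or y is None:
        return None
    return x * y


products = [safe_product(x, y) for x, y in padded]
```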

High CPU

By default, PiCloud will assign jobs one Amazon EC2 compute unit, the CPU capacity of a 1.0 to 1.2 GHz 2007 Xeon processor. If your job is CPU bound (that is, it does not spend most of its time waiting on I/O or sleeping), you can speed it up by granting it more compute units. Set the _high_cpu keyword argument to True within the arguments of cloud.call() or cloud.map() to assign jobs 2.5 compute units. Note that although higher computation rates apply, you may pay about the same per CPU-bound job, because the job will finish significantly faster.

Jobs marked _high_cpu will have a higher RAM limit of 1.7 GB, rather than the standard 850 MB.

Example:

cloud.call(foo,_high_cpu=True)  #foo will be assigned 2.5 compute units of CPU power
cloud.map(lambda x,y: x*y, xlist, ylist,_high_cpu=True)  #each job produced by this map will be assigned 2.5 compute units of CPU power

cloud.call(foo,_high_cpu=False)  #This is the same as not specifying _high_cpu. foo receives 1 compute unit of CPU power

Note

When PiCloud has slack capacity, 1 compute unit jobs may receive extra compute power, sometimes as much as 2.5 compute units. _high_cpu jobs, however, are guaranteed to receive 2.5 compute units of power.
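
The remark that a CPU-bound _high_cpu job may cost about the same can be checked with simple arithmetic. The sketch below assumes perfectly linear scaling with compute units and a price charged per second of runtime; both are simplifying assumptions, the workload figure is made up, and no actual PiCloud prices are used.

```python
STANDARD_UNITS = 1.0   # default compute units per job
HIGH_CPU_UNITS = 2.5   # compute units with _high_cpu=True


def runtime_seconds(work, compute_units):
    """Runtime of a CPU-bound job, assuming linear scaling (an idealization)."""
    return work / compute_units


work = 100.0  # hypothetical job: 100 seconds on one compute unit

standard_time = runtime_seconds(work, STANDARD_UNITS)
high_cpu_time = runtime_seconds(work, HIGH_CPU_UNITS)

# Ratio of per-second prices at which both options cost the same per job.
break_even_rate_ratio = standard_time / high_cpu_time
```

Under these assumptions the job costs the same either way when the high-CPU price per second is 2.5 times the standard price; I/O-bound jobs would not see this speedup.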

Cloud Files

The programming framework described above is powerful, but sometimes it is necessary to read and write large amounts of data. For instance, you may have data files that will be used by all future jobs; it makes no sense to send such files with every cloud.call().

Instead, you should use cloud.files, a module that provides a file storage and retrieval interface that can be used both on the client and PiCloud.

Internally, these files are stored within Amazon S3 buckets managed by PiCloud. This system is effectively a key-value store that maps a file name to the file’s data. The keys are not paths; while you may put the ‘/’ character in a filename, directories do not per se exist.
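
The "keys are not paths" point can be illustrated with a plain dictionary standing in for the store. This is a local model only, not the cloud.files API; the put helper and the file names are hypothetical.

```python
# A dict as a stand-in for a flat key-value file store.
store = {}


def put(name, data):
    """Store data under the exact key name; no directories are created."""
    store[name] = data


put("data/names.txt", b"alice\nbob\n")
put("data/2010/ages.txt", b"31\n27\n")

# There is no "data" entry: the slash is just another character in the key.
has_directory = "data" in store

# A directory-like listing is only a prefix filter over the flat key space.
under_prefix = sorted(k for k in store if k.startswith("data/"))
```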

The cloud.files interface is quite simple:

Example:

#The code below can run either locally or in a job running on PiCloud
cloud.files.put('names.txt')  #put names on the Cloud
cloud.files.get('names.txt','names2.txt')  #retrieve names.txt from the Cloud and store it as names2.txt
cloud.files.delete('names.txt')  #remove file

Several other functions exist to manage the stored files. Be sure to read the detailed documentation about the cloud.files interface.

Files can be up to 5 GB in size. Note that you are charged a monthly fee for storage.

Simulation

PiCloud offers a simulator to run your PiCloud code locally. You may find that debugging is easier in simulation. To enable the simulator, run:

cloud.start_simulator()

Or edit cloudconf.py, and set use_simulator to True.

For more information on the simulator, see the advanced section.
