Realtime Cores

Using realtime cores, you can allocate the exact number of cores you need for your computation by the hour. This section will compare how your jobs are scheduled with and without realtime cores.

Job Scheduling without Realtime

When you create a job, it enters a global queue shared by all of our users. When compute power becomes available, your job is dequeued and placed onto an available node.

We do not have a set number of total cores in our system, and there is no set number of cores per user. Instead, PiCloud continually and automatically estimates the workload we have now and will have in the next hour. There are two components to this estimation:

  1. Periodic Jobs: A large class of users runs jobs periodically, whether it’s every minute, hour, or day. Over time our system learns the amount of load each of these users contributes, and increases server capacity in anticipation of periodic jobs.
  2. Aperiodic Jobs: For users who do not have a predictable pattern, we scale our worker nodes once their jobs have been added to our queue. The number of workers we bring up depends on how long we estimate their jobs will take.

How do we estimate the runtime of a job?

We estimate the runtime of enqueued jobs from the average and variance of the runtimes of similar jobs. A similar job in Python is one that is executed with the same function, code version, and reserved keyword arguments. A similar job in the shell is one that is executed with the same command and flags.
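
As a rough illustration of the idea above, the following sketch groups past runtimes by a similarity key and computes the average and variance for each group. It is not PiCloud's actual estimator: the function names, the shape of the history data, and the use of population variance are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pvariance


def similarity_key(function_name, code_version, reserved_kwargs):
    """Hypothetical key: jobs with equal keys are treated as similar."""
    return (function_name, code_version, tuple(sorted(reserved_kwargs.items())))


def runtime_stats(history):
    """Map each key to (average, variance) of its observed runtimes.

    history is an iterable of (key, runtime_in_seconds) pairs.
    """
    grouped = defaultdict(list)
    for key, runtime in history:
        grouped[key].append(runtime)
    return {key: (mean(runs), pvariance(runs)) for key, runs in grouped.items()}


# Illustrative data only: two past runs of the same function and version.
key = similarity_key("process_image", "v1", {"_type": "c2"})
stats = runtime_stats([(key, 10.0), (key, 20.0)])
print(stats[key])
```

A new job with the same key would then be assumed to take roughly the stored average, with the variance indicating how much to trust that figure.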

How fast will your workload be completed?

Since we use Amazon Web Services servers, we are charged for every hour we have a server up. To manage our costs, we try to make sure our servers spend as much time as possible processing jobs, rather than sitting idle.

Here’s a rule of thumb for how this affects you: If each of your jobs takes an hour to process, PiCloud will automatically scale such that all of your jobs are running simultaneously in parallel. We’re happy to bring up as many as you need since you’re keeping our servers busy for the full hour increment that we rent from Amazon.

If each of your jobs takes half an hour, we’ll run about half of your jobs in parallel at a time, so that each core stays busy for the full hour by running two jobs back to back. In practice, our scaling algorithm is more liberal than this, but this should give you a good conservative ballpark.

If you’re just running a small batch of short jobs, you’ll generally get fewer than 10 cores. If you’re running a large number of long-running jobs, then using the above algorithm, we could potentially bring up hundreds of cores for you.
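
The rule of thumb above can be written as a small calculation. This is only a sketch of the conservative ballpark described in this section, not the real scaling algorithm (which is more liberal); the function name and the rounding choice are assumptions.

```python
import math


def estimated_parallel_jobs(num_jobs, job_runtime_hours):
    """Conservative estimate of how many jobs run at once.

    Each core should stay busy for a full billable hour, so a job
    shorter than an hour only justifies a fraction of a core.
    """
    if num_jobs <= 0:
        return 0
    busy_fraction = min(job_runtime_hours, 1.0)
    return max(1, math.ceil(num_jobs * busy_fraction))


print(estimated_parallel_jobs(100, 1.0))   # hour-long jobs
print(estimated_parallel_jobs(100, 0.5))   # half-hour jobs
```

With 100 hour-long jobs the estimate is 100 jobs in parallel; with 100 half-hour jobs it is 50, matching the examples in the text.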

Why Realtime

The model of our standard service is ideal for users trying out PiCloud for the first time, users with relaxed response-time requirements, and users with large batch jobs. Our realtime service is for our users who:

  1. Need N cores to be ready at a moment’s notice, even if that means those cores will be sitting idle at times. This is common for web applications that are offloading workloads to PiCloud, and need them completed immediately.
  2. Are jump-starting a large batch workload. If you find our predictive algorithms to be too conservative, then use Realtime Cores to boost the number of cores available for your jobs. In this case, after the jobs have been scheduled, you should release your cores manually, or set the “max duration” to the minimum of 1 hour.

You do not need Realtime Cores once your jobs have been scheduled and are processing. Realtime Cores only affect job scheduling. Releasing cores will not adversely affect any running jobs.

How it Works

You tell us the type of core you want, and the number you want, and we’ll bring them online just for you behind the scenes. The rule when using realtime is simple: Having N realtime cores active guarantees that you will be able to run N (cores of) jobs in parallel.

Your jobs will be placed in both your own “realtime” queue, as well as our global queue.

Example: If you have reserved 80 cores, and you check in 200 single-core jobs, at least 80 jobs will immediately be dequeued and begin processing on your realtime cores. More than 80 may run simultaneously, since you still have access to the cores available in our standard service.

Allocating Cores

The easiest way to allocate cores is by using the Web Dashboard. It can also be done programmatically.

In Python, the following call allocates 160 f2 cores for 24 hours:

>>> cloud.realtime.request('f2', 160, max_duration=24)

max_duration is optional. If omitted, the cores will be held indefinitely, until release is called:

>>> cloud.realtime.release(request_id)

In the shell, the equivalent commands are:

$ picloud realtime request --max-duration 24 f2 160
$ picloud realtime release request_id

Allocation Time

Cores are provisioned within 15 minutes. An e-mail is sent once provisioning has completed.

Allocation Limits

As a safety measure, by default, you may provision up to 5,000 cores at one time using our realtime service. To provision more, please file a support ticket first.

Pricing

The cost of having realtime cores active over a given hour depends on the amount of computation you run during that hour. In general, if you achieve ~60% utilization during that hour, there is no added cost to using realtime cores. If you do not achieve sufficient utilization, you may be charged a minimum hourly charge as detailed on the realtime pricing page.

Hypothetically, if you reserved a single realtime core with a minimum charge of $0.10/hour, then after an hour you would be charged the greater of $0.10 or the bill accrued by your jobs’ computation. Examples:

  • If you don’t run any jobs during the hour, you are assessed a realtime charge of $0.10.
  • If you run $0.13 worth of computation, you only pay $0.13; no realtime charges are assessed.
  • If you run $0.07 worth of computation, you are assessed $0.03 of realtime charges to meet the net cost minimum of $0.10.
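
The three examples above follow from one formula: the realtime charge is whatever is needed, if anything, to bring the hour's bill up to the minimum. The sketch below works in whole cents to avoid floating-point rounding; the function name is an assumption, and the $0.10 minimum is the hypothetical figure used above, not an actual price.

```python
def hourly_bill(computation_cents, minimum_cents):
    """Return (realtime_charge, total) in cents for one billable hour."""
    # The realtime charge only tops the bill up to the minimum.
    realtime_charge = max(0, minimum_cents - computation_cents)
    return realtime_charge, computation_cents + realtime_charge


for computation in (0, 13, 7):
    charge, total = hourly_bill(computation, minimum_cents=10)
    print(f"computation={computation}c realtime={charge}c total={total}c")
```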

FAQ

Here we highlight questions users have asked about our realtime service. Please read the above sections first to familiarize yourself with realtime cores.

What are the differences between jobs scheduled with realtime and jobs scheduled with the global queue?

None. Realtime cores are nothing more than a scheduling guarantee that states that if you have N realtime cores, N cores of your computation can run in parallel. Whether a job is scheduled with realtime has no effect once it starts processing.

Does realtime provide a discount?

No, jobs scheduled with realtime are assessed the same charges as jobs scheduled with the global queue. For general discounts for high volume and academic users, please see our pricing page.

Does realtime cost additional money?

This depends on how much computation is run while your realtime cores are active. In general, realtime is free if you achieve 60% utilization over each billable hour. Please see our pricing example above.

What if my realtime is released while jobs are running?

Once jobs are processing, the number of realtime cores you have does not affect them. Consequently, releasing realtime cores has no effect on running jobs.

What if I reserve realtime while jobs are running?

Once again, having N cores guarantees that N cores of computation can run in parallel. If your realtime cores are activated while jobs are running, the already-running jobs are counted toward that guarantee.

As an example:

  • If you have 20 c2 1-core jobs running and 20 queued, reserving 20 c2 realtime cores will not cause any additional jobs to start processing, as the realtime guarantee of 20 c2 cores is already met.
  • However, if you allocate 40 c2 realtime cores in total, once the cores are active, the guarantee will be 40 c2 cores. Consequently, your 20 queued jobs will immediately start processing, for a total of 40 jobs processing.
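
The two cases above can be checked with a short calculation. This sketch assumes single-core jobs that all use the same core type as the reservation, and it counts only the jobs started because of the guarantee; more may start on cores from the standard service. The function name is an assumption.

```python
def jobs_started_by_guarantee(running, queued, realtime_cores):
    """Queued jobs that start at once because of the realtime guarantee."""
    # Already-running jobs count toward the guarantee.
    shortfall = max(0, realtime_cores - running)
    return min(queued, shortfall)


print(jobs_started_by_guarantee(running=20, queued=20, realtime_cores=20))
print(jobs_started_by_guarantee(running=20, queued=20, realtime_cores=40))
```

The first call gives 0 and the second gives 20, as in the examples above.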