Cloud compute unit costs

Eric Wolak, May 24 2018

One of the best features of Google Cloud Platform’s customer-friendly pricing is that with both Custom Machine Types and Committed Use Discounts, you pay independently for CPU and RAM instead of buying them in fixed ratios. This lets you optimize your cloud footprint to match your actual workload, with no wasted cores or RAM due to a poor fit between your workload and the available instance types. It also exposes the relative cost of CPU vs. RAM, which lets you make informed decisions about compute-storage tradeoffs in your architecture. Alas, other cloud platforms haven’t yet caught up to this approach, but with a bit of math we can work out their effective per-vCPU and per-GB RAM prices. So, what are Amazon Web Services’ and Microsoft Azure’s unit prices? Let’s use linear regression machine learning to find out!

NOTE: This is kind of a silly use of the term “machine learning”, but I hope it serves as an example of simple things a software engineer can do using ML tooling.

GCE pricing in us-central1 (per month)

term         vCPU      GB RAM
on-demand    $24.22    $3.25
1-year       $14.54    $1.95
3-year       $10.38    $1.39

That works out to a ratio where one vCPU costs the same as 7.45 GB of RAM. Keep in mind that if you’re running more than 25% of a month, Sustained Use Discounts start kicking in, saving you up to 30% off this list price without any commitment.
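
As a quick sketch of that arithmetic (the dictionary below is just the table above, retyped), the CPU-to-RAM price ratio is essentially the same across commitment terms:

# GCP us-central1 list prices from the table above, in $ per month
gcp_prices = {
    'on-demand': (24.22, 3.25),
    '1-year':    (14.54, 1.95),
    '3-year':    (10.38, 1.39),
}
for term, (cpu, ram) in gcp_prices.items():
    print("{}: 1 vCPU costs the same as {:.2f} GB RAM".format(term, cpu / ram))
on-demand: 1 vCPU costs the same as 7.45 GB RAM
1-year: 1 vCPU costs the same as 7.46 GB RAM
3-year: 1 vCPU costs the same as 7.47 GB RAM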

Anyway, let’s use a basic linear regression to see what AWS’s per-vCPU and per-GB RAM prices are:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

aws_ratecard = pd.DataFrame.from_records([
    # Use instances of roughly the same size within each class so the linear
    # regression weights them about evenly.
    ['m5.12xlarge', 48, 173, 192, 2.304],
    ['c5.18xlarge', 72,  278, 144, 3.06],
    ['m4.16xlarge', 64, 188, 256, 3.20],
    ['c4.8xlarge', 36, 132, 60, 1.591],
    ['r4.16xlarge', 64, 195, 488, 4.256],
    ['x1.16xlarge', 64, 174.5, 976, 6.669],
    ['x1e.16xlarge', 64, 179, 1952, 13.344],
], columns=['instance type', 'vCPU', 'ECU', 'GB RAM', '$/hr'], index='instance type')

def pricing_regression(ratecard, cpu_unit='vCPU', print_result=True):
    # No intercept: we model the hourly price as purely per-unit CPU and RAM costs.
    per_unit_costs = LinearRegression(fit_intercept=False)
    per_unit_costs.fit(ratecard[[cpu_unit, 'GB RAM']], ratecard['$/hr'])
    monthly_costs = per_unit_costs.coef_ * 730  # ~730 hours in a month
    r2 = per_unit_costs.score(ratecard[['vCPU', 'GB RAM']], ratecard['$/hr'])
    
    cpu, ram = monthly_costs
    if print_result:
        print("{}: ${:.02f}/mo, RAM: ${:.02f}/GB/mo, ratio: {:.02f}    R²={:.06f}".format(
            cpu_unit, cpu, ram, cpu/ram, r2))
    else:
        return monthly_costs[0], monthly_costs[1], r2

pricing_regression(aws_ratecard)
vCPU: $19.10/mo, RAM: $4.23/GB/mo, ratio: 4.51    R²=0.991502

With an R² better than 0.99, it looks like our model has done a pretty good job of predicting AWS prices, even though we haven’t even included local SSD or network throughput. At $4.23/GB/mo, RAM in AWS is ~30% more expensive than it is in GCP, on average.
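
As a rough sanity check, we can rebuild one instance’s price from those (rounded) coefficients; picking m5.12xlarge from the rate card above, the model lands within a few percent of the list price:

# Rebuild m5.12xlarge's monthly price from the rounded per-unit rates above.
m5 = aws_ratecard.loc['m5.12xlarge']
modeled = m5['vCPU'] * 19.10 + m5['GB RAM'] * 4.23
actual = m5['$/hr'] * 730
print("modeled ${:.0f}/mo vs. actual ${:.0f}/mo".format(modeled, actual))
modeled $1729/mo vs. actual $1682/mo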

If we look at AWS’s previous generation, though, the story gets a bit different:

pricing_regression(aws_ratecard.loc[['m4.16xlarge', 'c4.8xlarge', 'r4.16xlarge']])
vCPU: $25.54/mo, RAM: $2.97/GB/mo, ratio: 8.60    R²=0.995799

A ratio of 8.6 is much closer to GCP’s 7.45, and the model continues to hold up. What if we exclude the (somewhat exotic) X1 instance types?

pricing_regression(aws_ratecard.loc[['m4.16xlarge', 'c4.8xlarge', 'r4.16xlarge', 'c5.18xlarge', 'm5.12xlarge']])
vCPU: $24.81/mo, RAM: $3.04/GB/mo, ratio: 8.17    R²=0.991503

Hm. Not much different. So the X1 instance types give you a ton of RAM, but it doesn’t come cheap.

Of course, if your workload is strictly RAM-limited, then you probably want to be using the instance type with the most GB RAM per vCPU, which would be the X1e types.

x1e = aws_ratecard.loc['x1e.16xlarge']
print("x1e.16xlarge: ${:.02f}/GB/mo".format(x1e['$/hr'] * 730 / x1e['GB RAM']))
x1e.16xlarge: $4.99/GB/mo

So if your workload is truly RAM-limited, then your RAM costs are going to be higher than the norm. That makes some intuitive sense, because you’re probably stranding some other resource like vCPU, network capacity, or power. For comparison, GCP’s Extended memory price, which kicks in on Custom Machine Types past 6.5GB per vCPU, is $0.009550/GB/hr, or $6.97/GB/mo. Even though AWS’s average per-GB RAM price is 30% higher than GCP’s, the incremental price for RAM on AWS is actually about 30% cheaper than GCP’s.
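
Spelling out that comparison (the hourly extended-memory price is the GCP figure quoted above, and the x1e numbers come from the rate card):

# Normalize both marginal RAM prices to $/GB/month (~730 hours per month).
gcp_extended_per_gb_mo = 0.009550 * 730
aws_x1e_per_gb_mo = 13.344 * 730 / 1952
print("GCP extended memory: ${:.2f}/GB/mo".format(gcp_extended_per_gb_mo))
print("AWS x1e marginal RAM: {:.0%} cheaper".format(1 - aws_x1e_per_gb_mo / gcp_extended_per_gb_mo))
GCP extended memory: $6.97/GB/mo
AWS x1e marginal RAM: 28% cheaper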

ECU

Just for fun, what if we do the same regression as we did at the beginning, but normalize the compute power using ECU (EC2 Compute Units) instead of vCPU?

pricing_regression(aws_ratecard, cpu_unit='vCPU')
pricing_regression(aws_ratecard, cpu_unit='ECU')
vCPU: $19.10/mo, RAM: $4.23/GB/mo, ratio: 4.51    R²=0.991502
ECU: $5.53/mo, RAM: $4.38/GB/mo, ratio: 1.26    R²=0.921549

Odd! I would’ve expected the R² to go up when using the normalized CPU measure (ECU), but instead the model actually got worse. Regardless, RAM still looks comparatively expensive on AWS.

Azure

Since we have all this infrastructure built, let’s take a peek at Azure to see how it compares. It’s a bit difficult to model with its variety of instance types and dramatic price difference between CPU generations, but if we stick to just the latest Broadwell parts, we can get a pretty clear result:

azure_ratecard = pd.DataFrame.from_records([
#     ['A2 v2', 2, 4, 20, 0.091]
    ['D2 v3', 2, 8,  50,  0.096, 'Xeon® E5-2673 v4', 'Broadwell'],
    ['D1 v1', 2, 7,  100, 0.146, 'Xeon® E5-2673 v3', 'Haswell'],
    ['E2 v3', 2, 16, 50,  0.133, 'Xeon® E5-2673 v4', 'Broadwell'],
    ['F2',    2, 4,  32,  0.10,  'Xeon® E5-2673 v3', 'Haswell'],
], columns=['instance type', 'vCPU', 'GB RAM', 'temp storage', '$/hr', 'CPU', 'Generation'], index='instance type')

pricing_regression(azure_ratecard[azure_ratecard['Generation'] == 'Broadwell'])
vCPU: $21.53/mo, RAM: $3.38/GB/mo, ratio: 6.38    R²=1.000000
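
That perfect R² shouldn’t be surprising: with only two Broadwell shapes and two coefficients, the regression is just solving two equations in two unknowns, which recovers essentially the same per-unit prices:

# Two instances, two unknowns: solve the 2x2 system directly instead of
# fitting a regression.
broadwell = azure_ratecard[azure_ratecard['Generation'] == 'Broadwell']
cpu_hr, ram_hr = np.linalg.solve(broadwell[['vCPU', 'GB RAM']], broadwell['$/hr'])
print("vCPU: ${:.2f}/mo, RAM: ${:.2f}/GB/mo".format(cpu_hr * 730, ram_hr * 730))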

Including the previous-generation Haswell instances and the compute-optimized F-series, we get a different picture:

pricing_regression(azure_ratecard)
vCPU: $36.78/mo, RAM: $1.50/GB/mo, ratio: 24.53    R²=0.183072

With an R² of just 0.18, something is clearly wrong here. Let’s dig in:

# Refit on the full Azure rate card, then compare each instance's actual
# price with what the two-coefficient model predicts.
per_unit_costs = LinearRegression(fit_intercept=False)
per_unit_costs.fit(azure_ratecard[['vCPU', 'GB RAM']], azure_ratecard['$/hr'])

df = azure_ratecard.copy()
df['predicted'] = per_unit_costs.predict(azure_ratecard[['vCPU', 'GB RAM']])
df['error'] = (df['predicted'] - df['$/hr']).abs() / df['$/hr']
df = df[['vCPU', 'GB RAM', '$/hr', 'predicted', 'error']]
df.style.format({'error': "{:.2%}"})
instance type    vCPU    GB RAM    $/hr     predicted    error
D2 v3            2       8         0.096    0.11721      22.09%
D1 v1            2       7         0.146    0.115156     21.13%
E2 v3            2       16        0.133    0.133641     0.48%
F2               2       4         0.1      0.108994     8.99%

It looks like the D1 v1 instance type is really messing up the regression because it looks basically the same as D2 v3 but costs 50% more. What if we drop it?

pricing_regression(azure_ratecard.drop(['D1 v1']))
vCPU: $29.75/mo, RAM: $2.20/GB/mo, ratio: 13.50    R²=0.824604

That’s more like it! So, to sum up, we have

gcp_cpu, gcp_ram = 24.22, 3.25
aws_cpu, aws_ram, _ = pricing_regression(aws_ratecard, print_result=False)
azure_cpu, azure_ram, _ = pricing_regression(azure_ratecard.drop(['D1 v1']), print_result=False)

df = pd.DataFrame([
    [gcp_cpu, gcp_ram, gcp_cpu / gcp_ram],
    [aws_cpu, aws_ram, aws_cpu / aws_ram],
    [azure_cpu, azure_ram, azure_cpu / azure_ram]
], index=['GCP', 'AWS', 'Azure'], columns=['$/vCPU/month', '$/GB RAM/month', 'ratio'])

df.style.format({'$/vCPU/month': "${:.2f}", '$/GB RAM/month': "${:.2f}", 'ratio': "{:.2f}"})
         $/vCPU/month    $/GB RAM/month    ratio
GCP      $24.22          $3.25             7.45
AWS      $19.10          $4.23             4.51
Azure    $29.75          $2.20             13.50

Conclusion

Even though AWS bundles CPU and RAM into defined instance types, their pricing reveals an underlying unit price for these resources that’s quite consistent across instance types, even the “extra memory” X1 types. On average, RAM in AWS is about 30% more expensive than in GCP, while Azure prices RAM, relative to CPU, almost 50% lower than GCP does.

This means that if your workload skews toward the high end of GB/vCPU but doesn’t exceed 8GB/core, Azure might be cheaper for you. At the other end, if your workload skews compute-heavy (but not below 2GB/vCPU, where you’d be stranding RAM), you might be better off on AWS. GCP sits in between the two, but with its low minimum of 0.9GB/core, it wins for truly compute-constrained workloads by wasting less RAM.
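
To make that concrete, here’s a rough sketch that prices two hypothetical workloads using the per-unit rates derived above (on-demand list prices only; it ignores sustained-use and committed-use discounts, instance-shape minimums, and everything besides CPU and RAM, and the workload sizes are just for illustration):

# Derived on-demand unit prices from the summary table above: ($/vCPU/mo, $/GB RAM/mo)
rates = {'GCP': (24.22, 3.25), 'AWS': (19.10, 4.23), 'Azure': (29.75, 2.20)}

# Two made-up workload shapes: ~2 GB/core and ~8 GB/core
workloads = [('compute-heavy (16 vCPU, 32 GB)', 16, 32),
             ('RAM-heavy (16 vCPU, 128 GB)', 16, 128)]
for label, vcpus, gb in workloads:
    print(label)
    for cloud, (cpu_rate, ram_rate) in rates.items():
        print("  {}: ${:.0f}/mo".format(cloud, vcpus * cpu_rate + gb * ram_rate))
compute-heavy (16 vCPU, 32 GB)
  GCP: $492/mo
  AWS: $441/mo
  Azure: $546/mo
RAM-heavy (16 vCPU, 128 GB)
  GCP: $804/mo
  AWS: $847/mo
  Azure: $758/mo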