
Back-of-the-envelope calculation of your BigQuery cost
Overview:
It is often important to figure out how much it will cost to run BigQuery and how to optimize that cost. Google provides tools to estimate how many slots you need and to manage cost once you have some workloads (data and processing, i.e., queries, reports, dashboards, etc.) running on BigQuery.
If you don’t have any data or workloads running yet and want a ballpark estimate of what BigQuery would cost in your corporate environment, that estimate is not easily available.
Approach:
For a back-of-the-envelope, ballpark estimate of what BigQuery would cost per month, we need the following inputs:
- How much data you have (DBSize)
- What share of that data you would need to process per day (ProcessedBytesPercent)
We can start by estimating the cost under on-demand pricing.
There are 2 components to the cost:
- Storage cost (assuming you use compressed storage)
- Processing cost.
For storage cost we use the general pricing available here, which as of this writing is $0.02 per GB per month (StorageCost).
For processing cost we use the general pricing available here, which as of this writing is $5 per TB processed (ProcessingCost).
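You could use the following formula for the monthly TotalCost:
TotalCost =
DBSize * StorageCost * CompressedStorageMultiple
+
DBSize * ProcessedBytesPercent * ProcessingCost * DaysInMonth
Because StorageCost is priced per GB and ProcessingCost per TB, express DBSize in GB in the first term and in TB in the second. ProcessedBytesPercent is the share of the data processed per day, so the processing term is multiplied by DaysInMonth to get a monthly figure.
If you use compressed storage, you can assume a CompressedStorageMultiple of 50% as an estimate. If you don’t use compressed storage, set it to 1 (that is, ignore it).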
To give a concrete example, let’s say:
DBSize = 1 petabyte (1,000 TB, or 1,000,000 GB)
ProcessedBytesPercent = 20% (per day)
StorageCost = $0.02 per GB/month
ProcessingCost = $5/TB
CompressedStorageMultiple = 50%
DaysInMonth = 30
The total cost would be:
TotalCost =
1,000,000 (in GB) * $0.02 * 0.5 (50%) = $10,000 (storage)
+
1,000 (in TB) * 0.2 (20% of bytes processed per day) * $5 * 30 (days in month) = $30,000 (processing)
= $40,000/month
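The same arithmetic can be sketched as a small Python function, which makes it easier to try different inputs. This is only a rough sketch under the assumptions above. The function name, parameter names, and the GB_PER_TB constant are made up for illustration, the default prices are the ones quoted in this post (check current pricing before relying on them), and decimal units are assumed (1 TB = 1,000 GB).

```python
# Assumption: decimal units, matching the 1 PB = 1,000,000 GB figure used above.
GB_PER_TB = 1_000


def estimate_monthly_cost(
    db_size_tb: float,
    processed_bytes_percent: float,            # share of the data processed per day, e.g. 0.20
    storage_cost_per_gb: float = 0.02,         # $ per GB per month, as quoted in this post
    processing_cost_per_tb: float = 5.0,       # $ per TB processed, as quoted in this post
    compressed_storage_multiple: float = 0.5,  # use 1.0 if storage is not compressed
    days_in_month: int = 30,
) -> dict:
    """Return a ballpark monthly on-demand cost split into storage and processing."""
    # Storage is priced per GB, so convert the database size from TB to GB.
    storage = (
        db_size_tb * GB_PER_TB * storage_cost_per_gb * compressed_storage_multiple
    )
    # Processing is priced per TB; the daily share is scaled up to a full month.
    processing = (
        db_size_tb * processed_bytes_percent * processing_cost_per_tb * days_in_month
    )
    return {"storage": storage, "processing": processing, "total": storage + processing}


# The example from this post: 1 PB of data, 20% of it processed per day.
estimate = estimate_monthly_cost(db_size_tb=1_000, processed_bytes_percent=0.20)
print(f"Storage:    ${estimate['storage']:,.0f}/month")
print(f"Processing: ${estimate['processing']:,.0f}/month")
print(f"Total:      ${estimate['total']:,.0f}/month")
```

Returning the two components separately shows which one dominates. In this example processing is three times the storage cost, so ProcessedBytesPercent is the input worth estimating most carefully.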
Conclusions:
As you can see, it is easy to estimate the cost of BigQuery once you have a few data points and make a few reasonable assumptions.