This post is intended as a somewhat practical response to “AWS doesn’t make sense for scientific computing”. We are considering computationally intensive, long-running workloads with a deep queue but reasonably easy coarse-grained parallelism. Should you build your own infrastructure or use a cloud provider like AWS?
Preamble – spot instances
Firstly, if you have a queue of work to be done, you very likely want to use spot cloud instances. For example, a c6a.24xlarge currently costs around $700/month on spot. This is significantly cheaper than even reserved instances.
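To compare this figure against other options, it helps to convert the monthly price into a price per vCPU-hour. The following is a minimal sketch that takes the $700/month figure from above and assumes 96 vCPUs for a c6a.24xlarge and an average month of 730 hours. Both are assumptions to check against current AWS documentation, and spot prices vary by region and over time.

```python
HOURS_PER_MONTH = 730.0  # assumption: average month (8760 hours / 12)


def cost_per_vcpu_hour(monthly_price_usd: float, vcpus: int,
                       hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Price of one vCPU for one hour, given a flat monthly instance price."""
    if vcpus <= 0 or hours_per_month <= 0:
        raise ValueError("vcpus and hours_per_month must be positive")
    return monthly_price_usd / (vcpus * hours_per_month)


spot_monthly_usd = 700.0  # figure quoted in the text; actual spot prices vary
assumed_vcpus = 96        # assumption: vCPU count of a c6a.24xlarge

price = cost_per_vcpu_hour(spot_monthly_usd, assumed_vcpus)
print(f"{price:.4f} USD per vCPU-hour")
```

Under these assumptions the result is roughly one cent per vCPU-hour, which is a convenient number to hold up against the fully loaded cost of a core in your own facility.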
Only consider your own facility once a steady workload is proven in the cloud
Get going in the cloud. Once there is a stable workload in the cloud, you can analyse the savings your own facility would bring. There is no point building a facility for an imaginary workflow.
Keep the data in the cloud
It is not efficient to shuffle data back and forth between the cloud and your own facility. Once you are processing in the cloud, keep the data there if at all possible.
AWS Glacier deep storage (S3 Glacier Deep Archive) costs about $12 per TB-year.
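As a rough sketch of what that rate means for a real archive, the snippet below multiplies it out. The $12 per TB-year rate is the figure quoted above. The dataset size and retention period are hypothetical, and retrieval and request charges are not included.

```python
def archive_cost_usd(terabytes: float, years: float,
                     usd_per_tb_year: float = 12.0) -> float:
    """Storage-only cost of keeping `terabytes` archived for `years`."""
    if terabytes < 0 or years < 0:
        raise ValueError("terabytes and years must be non-negative")
    return terabytes * years * usd_per_tb_year


# Hypothetical example: 500 TB kept for 5 years at the quoted rate.
print(f"${archive_cost_usd(500, 5):,.0f}")
```

At this rate the hypothetical 500 TB archive costs $30,000 over five years for storage alone.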
Key person risk
It does not feel nice to be tied to a cloud provider, but in your own facility you may be tied to a key person. In small and medium-sized facilities, losing a few key people can cause huge problems.
Even a short meeting within an organisation can easily cost $5,000 in people’s time spent preparing and attending. How many meetings does a local facility require?
Electricity and HVAC maintenance
How much downtime is needed for maintenance of the electricity and HVAC subsystems?
Budget around 7%-10% of the cost of the building per year for amortisation/depreciation/maintenance. If you’re not being billed for it now, are you sure you won’t be during the time your capital investment needs to pay off?
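A minimal sketch of that rule of thumb is given below. The 7%-10% range comes from the text above, while the building cost is a hypothetical figure chosen only for illustration.

```python
def annual_building_cost_range(building_cost_usd: float,
                               low_rate: float = 0.07,
                               high_rate: float = 0.10) -> tuple[float, float]:
    """Lower and upper yearly cost of the building under the 7%-10% rule of thumb."""
    if building_cost_usd < 0 or not 0 <= low_rate <= high_rate:
        raise ValueError("need a non-negative cost and 0 <= low_rate <= high_rate")
    return building_cost_usd * low_rate, building_cost_usd * high_rate


# Hypothetical example: a server room that cost $2,000,000 to build.
low, high = annual_building_cost_range(2_000_000)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

This amount comes on top of hardware, electricity, and staff, so it belongs in any comparison with a monthly cloud bill.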
Cost of things going wrong
If there is a serious problem with the local facility (long-term downtime, design fault, fire, etc.), how will it impact your organisation? Can you absorb these low-likelihood, high-impact scenarios?
Are you in the business of computing?
If your business is focused on something else, do you really want to be investing capital and time in creating an advantage in computing?