Hosting your app(s) - what are your options and how to choose one?
What are the different options of hosting your app(s) and the upsides/downsides of each one.
So you've come up with a great idea for an app, or maybe you want your own instance of something like Ghost for a blog with Plausible for analytics. Great! Now let's talk about putting it up somewhere for other people to access.
How to do it? What is the best choice considering ease of use, cost vs. performance and the time needed to get it up and running? There are many things to consider and questions you should ask yourself before choosing. For example:
- How many applications would you want to host?
- What are the options for setting up those apps? Is it your own app whose code will be hosted on e.g. GitHub, or an established one like Ghost with multiple ways of setting it up and lots of documentation around the process?
- What is the expected load for each app?
- Do the apps share dependencies like using the same RDBMS or KV store?
- How much do you wish to pay to host those? Or maybe it should be free, at least to some degree?
- What kind of reliability and downtime do you expect and accept?
- Where are the users? Do you require sharding and globally distributed hosting to reduce network latency?
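To get a feel for the last question, here is a rough sketch of the lower bound that distance alone puts on a round trip. It assumes signals travel through fiber at about 200,000 km/s (roughly two thirds of the speed of light in a vacuum) and ignores routing, queuing and processing, so real latency will be higher. The function name and the example distances are made up for illustration.

```python
# Approximate signal speed in optical fiber, expressed in km per millisecond.
# Assumption: about 200,000 km/s, i.e. roughly two thirds of c.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on the round-trip time for a given one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Hypothetical one-way distances between a user and the server.
for label, km in [("same region", 500), ("cross-continent", 4000), ("intercontinental", 10000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.0f} ms per round trip")
```

If most of your users sit within a few hundred kilometers of a single data center, this overhead is unlikely to be what decides your hosting choice.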
Some of those questions might be almost impossible to answer at first (expected load is a pretty common one for startups), but not knowing the answer also tells us a lot. One of the biggest selling points of the "Cloud" is that it helps you prepare for the unexpected - scaling up or down when needed, letting you quickly get a required service up and running, etc.
Of course the above comes at a price - in reality you pay a hefty premium for someone else to handle all of the hurdles and to have a practically infinite amount of resources at hand to scale up your service(s). But for most people (and companies too, even though not many can accept it) this is not really required, and exponential growth is not (and never will be) a realistic problem they need to handle.
The biggest advantage of cloud-type services vs. hosting and managing them yourself on your own hardware is reliability - this is especially true for critical things like databases which hold e.g. customer data, where losing it is the end of the line for many. Properly hosting, managing, upgrading and scaling something like Postgres is hard and requires experience - thus it commands a steep premium, which explains (at least to some degree) the prices of managed RDBMSs. The data and services are usually redundant, so even if the data center you're currently using gets blown up, everything will (in theory at least) quickly be brought back up somewhere else with minimal downtime. This kind of SLA isn't cheap, but it is critical to many businesses.
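To make the "what downtime do you accept" question concrete, an availability percentage can be turned into a downtime budget. This is a minimal sketch of that arithmetic only. The function name is made up for illustration, and real SLAs define their measurement periods and exclusions in their own terms.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Maximum downtime (in hours) that still meets the availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability -> up to {hours:.2f} hours of downtime per year")
```

Each extra "nine" cuts the budget by a factor of ten: 99% allows about 87.6 hours a year, 99.9% about 8.76 hours, and 99.99% under an hour. That is a large part of why higher guarantees get expensive quickly.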
The Cloud is a very generic term and comes in many flavors - with different degrees of control and different guarantees, so we can choose the solution best tailored to our needs.
Although the Cloud has many advantages, it may be that you don't need them that much, or the costs that come with them. Or maybe you simply need the data and software to be completely controlled by you or your company. Then you have the option to host and manage everything yourself - including choosing and buying the hardware on which it will all be running, down to the last screw. This of course is more complicated than delegating it to someone else and adds quite a bit of overhead. But many businesses and hobbyists use this model with great success.
Thinking about the points above is fundamental if you want to avoid roadblocks in the future, when you need to handle more traffic or want to add more apps/services. Fortunately a lot of current infrastructure is built around dynamic expansion and change, which helps alleviate these issues, at least a bit.
On-premise vs. Cloud
The first choice is between going on-premise and using some form of Cloud (VPS, IaaS, PaaS, SaaS - these are explained later on). The fundamental difference between the two approaches is quite simple - control over the hardware and software.
In the on-premise approach you control the hardware that is being used (the CPU(s), hard drives, memory, etc.) and the software, starting from the OS or even the motherboard firmware. This can come in two forms - either you directly own and manage the hardware or you lease it from someone. In both scenarios you also have the option of colocation - putting the hardware in a facility built for this purpose, where it sits next to other people's hardware and is looked after by someone else.
One thing to note - colocation can actually be cheaper for bigger operations because of the scale factor. Proper storage and maintenance of server hardware is complicated and doesn't come cheap, so it may very well be that doing it yourself (properly!) is more costly than handing that over to someone else who does it for many other people.
The Cloud was mentioned briefly above, but to be honest there is no concrete definition of what can be called "the Cloud" - there's a very broad spectrum of services and functionality that fall under this term, with the single common denominator being that someone else owns and manages the hardware and software. A lot of stuff stops being our problem and is shoved onto someone else. For a price, of course.
We will delve into the different types and flavors of the Cloud as even choosing that route can be quite daunting with the flood of offerings that have appeared over the years. I will try to explain the main points of each variant and the strengths/weaknesses of each.
VPS, IaaS, PaaS and SaaS
We can divide Cloud hosting and services into four main categories listed above, depending mostly on the level of control we have over the software stack and process.
VPS (Virtual Private Server)
A VPS is simply an instance of a VM (Virtual Machine) that is hosted and managed by some entity on shared hardware. What you get is usually the choice of the OS installed and a basic panel for managing stuff like IP addresses, SSH keys, firewalls, backups, etc.
It's almost identical to having the system installed by us on our own computer, except that we cannot access it directly but communicate with it via SSH - with maybe some option to do a "hard reset" or power down via the admin panel. The consequences of this are pretty straightforward - we have almost complete control over what gets installed and run, but we need to set it up from the ground up, which may be quite a daunting task for someone who has little to no experience as a sysadmin or even a power user. Also, as the hardware is shared, the amount of resources we have available may vary quite a bit over time, making it more unpredictable than dedicated hardware.
The biggest upside, aside from having a great deal of control, is the cost - on average the cost of compute and bandwidth in particular is much lower than in PaaS or even IaaS. Another key difference, tied to its limitations, is that the cost is (almost) flat - we pay a more or less constant amount each month, and any additional charges are more predictable than with something like AWS.
Example VPS providers:
- OVHcloud
- DigitalOcean
- Linode
- Hetzner
IaaS (Infrastructure as a Service)
There are similarities between VPS and IaaS in that we buy a certain amount of compute time and storage managed by a 3rd party. But IaaS aims at solving two problems which are hard to solve with a regular VPS:
- Easy, low-overhead and fast scaling of resources available
- Removal of the overhead tied to managing an OS - upgrades, patches, failure handling, etc.
In an IaaS we do not operate on one or more systems hosted on VMs with a set amount of resources each, but on a pool of resources and infrastructure - compute, storage, networking and security management (and the list goes on).
The goal here is elasticity - you get as much as you want, when you need it with little hassle. You need 100 compute nodes to run a quarterly Spark job to process some reports? A few clicks and it's available. A few hundred terabytes to store all of that? Sure.
This is especially handy when you have spikes in usage - maybe some compute-heavy ETL job(s) run from time to time, or your website sells holiday products and sees a very big increase in traffic during that period. With IaaS you only use (and pay for) what you need - with additional guarantees that it will work when you need it. Of course a premium is paid for these features - scalability and reliability aren't given to us for free.
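The trade-off between paying a premium per unit and only paying for what you use can be sketched with some simple arithmetic. Everything below is an assumption made for illustration: the hourly rates, the node counts and the function names are invented, and real providers price things in far more complicated ways.

```python
def fixed_cost(peak_nodes: int, hours: int, flat_rate_per_node_hour: float) -> float:
    """Capacity sized for the peak and paid for the whole period."""
    return peak_nodes * hours * flat_rate_per_node_hour


def elastic_cost(nodes_per_hour: list[int], on_demand_rate_per_node_hour: float) -> float:
    """Pay only for the nodes actually running in each hour."""
    return sum(nodes_per_hour) * on_demand_rate_per_node_hour


HOURS = 30 * 24  # one 30-day month

# Hypothetical prices: the elastic option costs 3x more per node-hour.
FLAT_RATE = 0.02
ON_DEMAND_RATE = 0.06

# Spiky month: 2 nodes most of the time, 20 nodes during a 48-hour rush.
spiky = [2] * (HOURS - 48) + [20] * 48
# Steady month: 2 nodes all the time.
steady = [2] * HOURS

for name, usage in [("spiky", spiky), ("steady", steady)]:
    fixed = fixed_cost(max(usage), HOURS, FLAT_RATE)
    elastic = elastic_cost(usage, ON_DEMAND_RATE)
    print(f"{name}: fixed capacity {fixed:.2f}, elastic {elastic:.2f}")
```

With these made-up numbers the elastic option wins clearly for the spiky month despite the higher unit price, while for the steady month the flat-rate capacity is about three times cheaper. Which side of that line you fall on depends entirely on your own load profile.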
Example IaaS providers:
- AWS (EC2)
- Microsoft Azure
- Google Compute Engine (GCE)
- Rackspace
PaaS (Platform as a Service)
We can see PaaS as an extension of the idea behind IaaS - reducing the overhead of running software by taking the management of the infrastructure away from the user and giving them easy ways to scale up as their needs grow.
PaaS takes this further by providing managed services like databases, analytics, scheduled execution, monitoring, etc. and packing it all up into a single, coherent offering. The idea is to make development and provisioning as easy as possible, removing the overhead and effort needed to manage that. This makes bootstrapping applications and multi-application solutions easier, but of course we need to pay a premium for all this red carpet that is given to us.
Even though the cost is steep when looking at the "raw numbers" of what we're getting in terms of hardware resources, there is big interest in these kinds of offerings, mainly from solo developers and small teams (many startups among them). The reasons are the ones outlined above - on average it is much easier to run and manage our solution on a PaaS than on an IaaS, and even more so than on a VPS.
The biggest downside (aside from the costs) in my opinion is the risk of lock-in and of being too reliant on "how things are done" according to the people who designed the platform. This makes moving somewhere else harder, as porting platform-specific logic can be a very big task, sometimes impossible without serious rewrites and a rethinking of the whole architecture. But if you're in the MVP stage trying to get off the ground, or doing some hobby project, thinking too much about these kinds of problems might be a waste of time better spent elsewhere.
Example PaaS providers:
- Heroku
- Render
- Fly.io
- Google App Engine
- AWS Elastic Beanstalk
SaaS (Software as a Service)
This can be considered the highest level of abstraction over the infrastructure - here the 3rd party manages the whole thing to provide us with the service we require. At most we provide some high-level configuration details and it should be ready to go. It takes almost no time or overhead to start using it.
SaaS is most probably the easiest to start using, but it'll cost us the most - a SaaS offering is usually a business in itself, and for the extreme (compared to the previous approaches) ease of use we will pay a premium. But in return what we get is (usually) great reliability and availability, support, SLAs in the higher packages and complete removal of the maintenance hassle. Create an account, pay and start using it - the end.
The downsides are similar to those of PaaS - the cost and lock-in. Being locked into a SaaS can be especially tricky and migration might require really large rewrites - e.g. moving away from Mailchimp to generic self-hosted SMTP.
Some of the SaaS offerings have an open-source variant for self-hosting which can also be taken into consideration - we will have to do it all ourselves but it's possible we'll save a lot in operational costs in the long run. But here each case differs greatly depending on the usage scenario and the SaaS to be used. As an example the website you're now visiting runs on a self-hosted instance of Ghost.
What should I choose?
So we're nearing the end and there is no clear answer as to which one is the best. The reason is that there isn't one - as mentioned in the beginning, this choice depends heavily on our goals, limits and expectations. We need to know what requirements we have and then start narrowing down the options until we arrive at a handful which look good - then drill down into the specifics and maybe try to find opinions of people who have used the given provider. Here we need to be careful of astroturfing and hidden ads - unfortunately some companies choose these kinds of underhanded methods.
Here is how I see each one of them and the most important points I consider when evaluating these options, either for a personal project or when advising someone else:
On Premise
- Full - here it may vary greatly. Either you just want to host your own blog or calendar app and have an old box lying around, or you're a tech giant like Google or Amazon for whom full ownership of the hardware and software makes the most business sense. In the second case you do not need advice. In the first one you need to be aware of the work that has to be put in, and that attaining good reliability will be very hard without a lot of knowledge.
- Partial (leased hardware, colocation) - often a good middle ground between fully owning and managing the hardware and using a cloud solution. The scale at which the provider operates often lets them lower the costs of both purchasing and operating the hardware. In addition they have a trained team to handle any problems that pop up, which can be invaluable. For companies that have stable needs and in-house system administration capabilities this may be the option that provides great ROI.
Cloud
- VPS - a good entry point that provides great value for the money. If you have the knowledge and patience you can easily host multiple applications at a low cost and scale up vertically and horizontally when needed. For the ambitious hobbyist.
- IaaS - you have varying needs in regards to the resources needed (e.g. quarterly spikes) and/or need greater reliability while also removing some of the system administration overhead. Either you're growing fast (or plan to) or have spikes of needed resources - needing to scale up fast.
- PaaS - a startup or a hobbyist developer benefits greatly from the ease of use of a PaaS, which makes it possible to ship the long-awaited MVP faster and lowers the human resources needed to manage the infrastructure. "git push prod", getting off the ground quickly and agility are the selling points. Just don't stay too long if your business takes off - the cost might give you a heart attack.
- SaaS - you simply need a solution to a specific problem, and you want it now. You have little to no interest in hosting or developing the solution to the problem yourself and you're ready to pay someone else to solve it for you, additionally giving some guarantees that it'll work.
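The list above can be condensed into a very rough first-pass filter. This is only a sketch of one possible reading of those points, not a definitive decision procedure: the `Needs` fields, the order of the checks and the `suggest` function are all simplifications invented for illustration, and real decisions involve cost, team and compliance factors that a handful of booleans cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    must_own_hardware: bool            # data and software fully under your control
    wants_ready_made_service: bool     # a specific problem solved now, no hosting
    spiky_or_fast_growing_load: bool   # e.g. quarterly spikes or rapid growth
    has_sysadmin_skills: bool          # knowledge and patience to run a server


def suggest(needs: Needs) -> str:
    """Return a starting point to investigate, not a final answer."""
    if needs.must_own_hardware:
        return "On-premise (full or partial)"
    if needs.wants_ready_made_service:
        return "SaaS"
    if needs.spiky_or_fast_growing_load:
        return "IaaS"
    if needs.has_sysadmin_skills:
        return "VPS"
    return "PaaS"


# Example: a hobbyist comfortable with Linux and with steady, modest traffic.
hobbyist = Needs(
    must_own_hardware=False,
    wants_ready_made_service=False,
    spiky_or_fast_growing_load=False,
    has_sysadmin_skills=True,
)
print(suggest(hobbyist))
```

The order of the checks is itself a judgment call. For example, someone with spiky load and no interest in infrastructure might well prefer a PaaS over an IaaS.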
There is also something that could be called a "semi-cloud" solution, where we "cloudify" our self-managed system(s) and infrastructure into an IaaS/PaaS/SaaS. We can create our own IaaS with Kubernetes, a PaaS using something like Dokku, or a self-hosted alternative to a SaaS with something like Nextcloud - of course we'll be the ones responsible for setting it up properly and keeping it running. And this is hard, especially at scale.
I hope this post helps someone choose the next platform on which to develop their application(s). General advice - there is no perfect solution, so do not get too hung up on choosing one. Especially when we're sticking to the layers below PaaS, moving from one to another isn't the end of the world. Put all that energy into developing what you're trying to host.
Happy coding and hosting!