The modern web is a vast, ever-expanding network of interconnected machines and devices. We tend not to think of how all this works whilst we are Instagramming, tweeting or giving our followers a status update. You type a website address, and you (hopefully) get the page you wanted. But what sits beneath a website, and how does it work?
We’re a curious bunch here at Fresh Egg, and we’re all about sharing our knowledge. I have been a developer for many years, and I have come to realise that there's still a lot of confusion amongst some clients with regard to the technologies that, in combination, make websites, and the internet, work. All websites work on the same principle, no matter how big or small - your own website, as well as those hilarious cats that dance about your screen on Facebook. In this blog post, I'll walk the stack and identify the key technologies that all contribute to the display of a web page - the browser, content delivery network, hosting provider, firewall, load balancer, server, web application and database.
1. The Browser
We all have our favourite – from Chrome (62% of market share) right down to Internet Explorer (6.3%) – but what actually is a browser?
First and foremost, when you visit a website, you’re doing so through a web browser (a user agent). The browser is responsible for downloading content from a remote server and displaying that content to the visitor.
This content is in the form of an HTML page. HTML (which stands for Hypertext Markup Language - arrrrrggghhh that's what that means) describes the structure of content to be displayed on a page. It is the responsibility of the browser (hello Google Chrome, Safari, Firefox) to interpret what that HTML means, and show that to the user - this is called rendering.
“Great – so where does this content come from, for the browser to be able to download?” I hear you cry.
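To make "HTML describes structure" a little more concrete, here is a minimal Python sketch that reads a page and prints its tags as an indented outline. It is only a toy stand-in for what a real browser does before rendering: the `OutlineParser` class, the `outline_of` helper and the sample page are all invented for illustration, and the sketch assumes every tag is properly closed.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Records each opening tag with its nesting depth (a toy, not a real browser parser)."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag) pairs

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline_of(html):
    """Return the (depth, tag) outline of an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical page, invented for illustration.
PAGE = "<html><body><h1>Hello</h1><p>Welcome to my site.</p></body></html>"

# Print one tag per line, indented by how deeply it is nested.
for depth, tag in outline_of(PAGE):
    print("  " * depth + tag)
```

The heading and the paragraph both sit at the same depth inside the body, which is the kind of structural information a browser uses when deciding how to lay the page out.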
2. The Content Delivery Network (CDN)
Whether you know it or not, you probably interact with a CDN on a daily basis, from all sorts of locations – including your own bed, if you’re one of the 40% of adults who use the Internet on their phones within five minutes of waking up.
But what is a CDN, and is it judging you for getting your internet fix first thing?
Put simply, a CDN’s mission is to virtually shorten the physical distance between you, the end user, and the content your browser is trying to download from the original website’s server. Remember the ‘here’s one I made earlier’ trick on Blue Peter? This is the job of the CDN. It’s the middleman between you and the origin server, serving you a cached copy of the content you’re requesting, so you can see it nice and quickly. Content doesn't start life at the CDN though; it has to be read from wherever the website is hosted...
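The "here's one I made earlier" idea can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular CDN works: the `EdgeCache` class, the `origin` function and the 60-second lifetime are all assumptions made up for the example.

```python
import time


class EdgeCache:
    """Toy CDN edge node: serve a cached copy while it is fresh, otherwise ask the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds  # assumed lifetime of a cached copy
        self.clock = clock
        self.store = {}  # path -> (expires_at, content)

    def get(self, path):
        now = self.clock()
        cached = self.store.get(path)
        if cached is not None and cached[0] > now:
            return cached[1], "HIT"  # the one we made earlier
        content = self.fetch_from_origin(path)  # slow trip to the origin server
        self.store[path] = (now + self.ttl_seconds, content)
        return content, "MISS"


origin_requests = []


def origin(path):
    # Hypothetical stand-in for the real website's server.
    origin_requests.append(path)
    return f"<html>content for {path}</html>"


edge = EdgeCache(origin)
first = edge.get("/index.html")   # not cached yet, so the origin is asked
second = edge.get("/index.html")  # served from the cache
print(first[1], second[1], len(origin_requests))  # prints: MISS HIT 1
```

Two visitors asked for the same page, but the origin server only had to do the work once.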
3. The Hosting Provider
Everything has to run from somewhere and this means a hosting provider. A hosting provider is responsible for the underlying server (or servers) that runs a website, and any supporting software.
Hosting providers come in various shapes and sizes, from small independents to larger infrastructure-endowed providers like WireHive and Rackspace, all the way up to the so-called hyper-cloud providers: Amazon AWS, Microsoft Azure, Google Cloud and IBM Cloud. I like things nice and tidy, so I put these into bands: band A (independents), band B (infrastructure-endowed) and band C (hyper-cloud).
What’s the difference?
The hosting requirements of a website are largely driven by the purpose of the website and the volume of traffic it typically handles.
Small websites, such as static websites or brochure-ware websites running on a lightweight Content Management System (CMS), may opt for an independent or low-cost hosting solution (band A). Larger websites, and quite often transactional websites such as e-commerce platforms, require more from their hosting infrastructure, so might opt for a band B or C hosting platform.
Many of the band B hosting providers provide their own 'cloud' infrastructure. Cloud is a very wide term in the modern Internet age, but in terms of hosting it generally means the ability to manage a set of virtual servers running in a data centre somewhere.
Hyper-cloud hosting providers (band C) are a relatively new class. Starting with Amazon when they launched their first cloud services, including the Amazon Elastic Compute Cloud (Amazon EC2), in 2006, a number of these platforms have been launched, such as Microsoft Azure. Hyper-clouds provide a vast number of online services (something-as-a-Service, *aaS) for managing many aspects of digital networks and web applications, from basic content hosting all the way to elastic, scalable databases that dynamically respond to demand.
Whatever your hosting provider, the content flows through their network to their target...
4. The Firewall
It sounds like a protective barrier, and that’s exactly what it is. A firewall is often put in place to protect your hosting infrastructure against potential attacks, or to control access to resources. Firewalls inspect both inbound and outbound traffic to the hosting infrastructure, or a specific server, and work with rules. These rules govern how the firewall reacts when inspecting traffic. A common rule on most firewalls is to allow TCP traffic on port 80 (the default port for HTTP traffic). You may also have more specialised rules in place, for example to ensure that only your organisation can access your CMS, or to allow only limited access to your UAT server.
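As a rough Python sketch of how rules like these might be evaluated, the example below checks each connection against an ordered list and denies anything that matches nothing. The `Rule` class, the `decide` function, port 8443 for the CMS and the IP ranges are all hypothetical; real firewalls have their own rule syntax and far more options.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    action: str                # "allow" or "deny"
    protocol: str              # e.g. "tcp"
    port: int
    source: str = "0.0.0.0/0"  # CIDR range; the default matches any IPv4 address


def decide(rules, protocol, port, source_ip):
    """Return the action of the first matching rule, denying by default."""
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port == port
                and ip_address(source_ip) in ip_network(rule.source)):
            return rule.action
    return "deny"


RULES = [
    Rule("allow", "tcp", 80),                      # HTTP from anywhere
    Rule("allow", "tcp", 8443, "203.0.113.0/24"),  # hypothetical CMS port, office range only
]

print(decide(RULES, "tcp", 80, "198.51.100.7"))    # a visitor browsing the site: allow
print(decide(RULES, "tcp", 8443, "203.0.113.25"))  # someone in the office using the CMS: allow
print(decide(RULES, "tcp", 8443, "198.51.100.7"))  # an outsider trying the CMS: deny
```

Denying by default is a design choice in this sketch: anything not explicitly allowed is blocked, which is a common starting point for firewall configuration.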
Firewalls can either be hardware or software. Typically most operating systems (such as Windows) come with a built-in software firewall. Hardware firewalls are devices that can be installed into a network and generally provide direct compute power for analysing network traffic.
As a minimum, it is recommended to have a software firewall in place, but depending on your needs, you may want to consider whether a hardware firewall is right for your website.
Next, we're nearing our destination...
5. The Load Balancer
A load balancer is a specialised device that’s designed to balance incoming traffic across potentially multiple servers, sitting between the client and server farm. Whilst they’re predominantly used for larger complex websites where there may be a need to run the website on multiple servers to handle the volume of traffic, a common strategy is to use a load balancer for smaller websites working with a single server for future scalability.
But what does it do?
A load balancer can determine which server should handle a particular request, and quite often can provide alerts and/or react when a server becomes unavailable. A load balancer will use a strategy, such as round-robin (requests are handled by your different servers in turn), or least connections (the server with the fewest active connections wins). There are numerous strategies that can be employed, and each suits different workloads.
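The two strategies mentioned above can be sketched in Python as follows. This is a minimal illustration, not how any specific load balancer is implemented; the class names and the `web-1` to `web-3` server names are made up for the example.

```python
from itertools import cycle


class RoundRobin:
    """Hand requests to each server in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


class LeastConnections:
    """Hand each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.open_connections = {server: 0 for server in servers}

    def pick(self):
        # Ties go to whichever server was listed first.
        server = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[server] += 1
        return server

    def release(self, server):
        # Call this when a request has finished.
        self.open_connections[server] -= 1


SERVERS = ["web-1", "web-2", "web-3"]  # hypothetical server names

rr = RoundRobin(SERVERS)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(SERVERS)
lc_picks = [lc.pick() for _ in range(3)]  # each server now has one open connection
lc.release("web-2")                       # web-2 finishes its request first
lc_picks.append(lc.pick())                # so web-2 gets the next one

print(rr_picks)  # ['web-1', 'web-2', 'web-3', 'web-1']
print(lc_picks)  # ['web-1', 'web-2', 'web-3', 'web-2']
```

Round-robin ignores how busy each server is, which is fine when requests take roughly the same time; least connections copes better when some requests are much slower than others.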
Using a load balancer for a small website with a single server means you can expand in future without having to make any major changes to your DNS configuration (such as changing IP addresses), and also allows you to react quickly to changes in network traffic.
From the load balancer, we land at our destination...
6. The Server
The mothership, the brain, the hub. The server is the host computer that actually runs your website (a web application). A server could be virtual (i.e. running virtualised on top of a hardware host), or hardware itself (often called a bare-metal server). There are different pros and cons to both approaches, but it often comes down to cost and scalability.
A virtual server has the benefit of essentially being software, which means we can adjust how many resources are given to it to be able to run your website. Virtual servers can often be spun up a lot quicker than provisioning a new bare-metal server, and in terms of reacting elastically to increases in demand, they’re a lot easier to manage. For the hyper-clouds (band C), these environments are virtual, which allows cloud environments to ramp up and down in accordance with your traffic.
A bare-metal server can generally give you better outright performance as you are executing software directly on the physical hardware, which means better read/write performance from the disks, better usage of memory, etc.
Your hosting provider may or may not offer both approaches and it’s worth exploring what is available. Conversations with your developers, your IT team, and if available, a consultant from your hosting provider, will help you determine which sort of environment you need. They are best placed to understand what your website may need to run smoothly. Having your analytics team on standby at this point to give insight into expected traffic will also be extremely beneficial in the long run.
That’s all very well, but how does your website actually run on the server? Let’s drill a little deeper...
7. The Web Application
Your website is fundamentally a web application. Web applications come in all shapes and sizes. A lot of web applications are built on top of a content management system (CMS), such as Kentico. A CMS provides the building blocks for your website, including management of content pages (such as your homepage, or blog), and media items (images and videos). For more bespoke use cases, it might be that the application is built entirely standalone without using a CMS, but this is often for websites that won’t need regular updating or scaling.
The web application – your website – has to handle incoming requests from users and give responses (in the form of HTML pages, or other formats like JSON, images, etc.). How your application is executed can vary greatly, but generally it means running on top of a specialised software component known as a web server (not to be confused with the server described above, which is the host machine with an operating system such as Windows, Linux, etc.).
A web server is responsible for accepting requests and passing them to your application to be handled. Common web servers in use today include Microsoft IIS (on Windows), and Apache, Tomcat or Nginx on Linux-based platforms, with Apache among the most popular.
Your web application has a big influence on what web server you need. PHP-powered web applications (such as WordPress or Joomla) can run on any of the web servers listed above, whereas ASP.NET applications are limited to running on Microsoft IIS. The newer generation ASP.NET Core platform is both cross-platform and self-hosted (the web application itself is a web server).
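To illustrate the hand-off between a web server and a web application, here is a small Python sketch using WSGI, the standard interface between the two in the Python world. It is an assumption-laden toy: the `/api/status` path, the page content and the `simulate_request` helper are invented for the example, and the helper stands in for a real web server so nothing has to listen on a network port.

```python
import json
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A tiny web application: the web server calls this once per request."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        body = b"<html><body><h1>Home</h1></body></html>"
        status, content_type = "200 OK", "text/html; charset=utf-8"
    elif path == "/api/status":  # hypothetical JSON endpoint
        body = json.dumps({"status": "ok"}).encode("utf-8")
        status, content_type = "200 OK", "application/json"
    else:
        body = b"Not found"
        status, content_type = "404 Not Found", "text/plain; charset=utf-8"
    start_response(status, [("Content-Type", content_type),
                            ("Content-Length", str(len(body)))])
    return [body]


def simulate_request(path):
    """Call the application the way a web server would, without opening a socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"]["Content-Type"], body


print(simulate_request("/"))
print(simulate_request("/api/status"))
print(simulate_request("/missing"))
```

To try it in a real browser, the same `application` function could be handed to Python's built-in development server with `wsgiref.simple_server.make_server("", 8000, application).serve_forever()`, which plays the web server role described above.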
Quite often when serving HTML from a web application, it reads this information from a database...
8. The Database
The database is a central store for data. Data could be anything from page content to a user record or a media item. The database is your one source of truth for all of the data for your web application, so it's important that this is both secure and backed up regularly.
Databases run on a database server, and this may or may not be on the same server as your web application. It's quite often the case that the database will run separately to the web application as you separate the concerns of each environment.
There are also many different types of database servers, from relational databases (data is organised into table structures linked with keys) such as MySQL or Microsoft SQL Server, to document databases (NoSQL databases) such as MongoDB or CouchDB, and graph databases. Each database type has different use cases, and your developers will select the appropriate database based on their understanding of what the web application requires.
The end of the line! How to complete the circuit? Once you've travelled all the way through the stack, the web application will generate an appropriate response (hopefully the page or resource you wanted). This response then flows back to the browser where it is interpreted and displayed.
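For a feel of what "table structures linked with keys" means in a relational database, here is a small Python sketch using SQLite, which ships with Python. The `authors` and `pages` tables and their contents are made up for illustration; a real CMS would have a much larger schema.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE pages (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)  -- the key linking the two tables
    );
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Sam')")
conn.executemany(
    "INSERT INTO pages (title, author_id) VALUES (?, ?)",
    [("Home", 1), ("Blog", 1)],
)

# Follow the key from each page to its author.
rows = conn.execute("""
    SELECT pages.title, authors.name
    FROM pages
    JOIN authors ON authors.id = pages.author_id
    ORDER BY pages.id
""").fetchall()
print(rows)  # [('Home', 'Sam'), ('Blog', 'Sam')]
```

The author's name is stored once and referenced by key from each page, so changing it in one place changes it everywhere it is used.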
Although this is a very simplified view of the technology stack that makes up the Internet, hopefully it gives you an interesting* insight into just how many moving parts there are to allow you to use the web through your browser. This is largely the same for other types of applications, such as mobile, social networks and apps.
*level of interest dependent on whether you’re a web geek like us, of course.
Does all of this make sense? Any questions at all? Drop us a line and I, or one of the team, will be more than happy to have a chat.