Before this webpage reaches your screen, many processes take place in the background. The setup of this website combines free web hosting on AWS S3 with blazing-fast content delivery and loading through CloudFront and Gatsby. Let us delve into the setup of this website and how its content is delivered to its users.
This blog post goes into the hosting setup of this static website. It is based on this very well-written tutorial by John Sobanski. The diagram[1] below describes the final setup of this website. You can follow that tutorial if you would like to build a site like this yourself, so repeating those build steps is not the purpose of this blog. Instead, this blog aims to walk you through the (if everything goes well) hidden background processes that take place in order for this page to reach your screen. There are two reasons why this would be of interest:
This blog will start with the question "What is a static website really?" and then explain each service from user to resource:
This website is static. What does that mean? How does that impact our development, deployment, and delivery? Websites fall into one of two categories: static and dynamic. Whereas the pages of a static website are pre-built before anyone requests them, the pages of a dynamic website are built by a server upon each request.
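To make the distinction concrete, here is a minimal Python sketch. The names `build_static_site` and `handle_dynamic_request` and the page contents are hypothetical, and this is not how Gatsby works internally; the sketch only illustrates when the HTML gets built.

```python
from datetime import datetime, timezone

# Hypothetical page content; a real site would read Markdown or a database.
PAGES = {"/": "Home", "/the-how": "How this site is hosted"}


def render(title: str, built_at: str) -> str:
    """Turn a page title into a tiny HTML document."""
    return f"<html><body><h1>{title}</h1><p>Built at {built_at}</p></body></html>"


def build_static_site(built_at: str) -> dict[str, str]:
    """Static: render every page once, at build time, before any visitor arrives."""
    return {path: render(title, built_at) for path, title in PAGES.items()}


def handle_dynamic_request(path: str) -> str:
    """Dynamic: render the page again for every incoming request."""
    now = datetime.now(timezone.utc).isoformat()
    return render(PAGES[path], now)


# Build once; afterwards serving a page is just a lookup of a finished file.
site = build_static_site(built_at="2024-01-01T00:00:00+00:00")
print(site["/the-how"])
print(handle_dynamic_request("/the-how"))
```

The static version does all of its work up front, which is why the finished files can simply be stored and handed out without running any code per visitor.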
When you type a domain name into your browser, such as "www.maartenpoirot.com", your browser first needs to find out the IP address associated with that domain so it can connect to the server hosting the website. This process is where the DNS (Domain Name System) comes into play.
The DNS is like the internet’s phonebook. It translates human-readable domain names into IP addresses that computers understand. Here’s how it works step by step:
Connection to the Website: Armed with the IP address, your browser can now establish a connection to the server hosting the website content. It sends an HTTP request for the specific webpage you requested (like "www.maartenpoirot.com/the-how") to that IP address.
Content Delivery: The server receives the request, retrieves the requested webpage, and sends it back to your browser, which then renders the content for you to view.
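As an illustration of what such a request looks like on the wire, the sketch below builds a minimal HTTP/1.1 GET request by hand. The helper name `build_get_request` is made up for this example, and a real browser sends many more headers.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # tells the server which website is meant
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("www.maartenpoirot.com", "/the-how")
print(request.decode("ascii"))
```

Note that the domain name only appears in the `Host` header; the connection itself is made to the IP address that the DNS lookup returned.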
This entire process typically happens in milliseconds, allowing you to access websites quickly and seamlessly. DNS is crucial to the functionality of the internet, as it ensures that users can easily navigate to websites using human-readable domain names rather than having to remember complex IP addresses.
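The phonebook analogy can be sketched in a few lines of Python. The `PHONEBOOK` table, the `cache`, and the function names are assumptions made for illustration, and the IP address comes from a range reserved for documentation, so it is not the real address of this website. The function `real_resolve` shows the standard-library call that asks your operating system's resolver; it is defined but not called here because it needs network access.

```python
import socket

# Toy "phonebook". The address is from the reserved documentation range
# 192.0.2.0/24 and is NOT the real address of this website.
PHONEBOOK = {"www.maartenpoirot.com": "192.0.2.10"}

# Answers we have already looked up, so repeated visits skip the lookup.
cache: dict[str, str] = {}


def toy_resolve(domain: str) -> str:
    """Look a domain up, answering from the local cache when possible."""
    if domain not in cache:
        cache[domain] = PHONEBOOK[domain]  # stands in for asking a DNS server
    return cache[domain]


def real_resolve(domain: str) -> str:
    """Ask the operating system's resolver (needs network access)."""
    return socket.gethostbyname(domain)


print(toy_resolve("www.maartenpoirot.com"))
```

Caching of earlier answers is one reason why lookups are usually fast in practice.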
In contrast to HTTP, HTTPS connections perform two actions to counter two types of cyber security threats:
Setting up the HTTPS connection is what is technically called a handshake. This is usually a one-way Secure Sockets Layer (SSL) handshake, in which only the identity of the server is authenticated, and not the identity of the user. (SSL has since been succeeded by Transport Layer Security, TLS, but the older name is still widely used.) To perform this authentication, the server sends over its SSL certificate. The client has a Trust Store that it can consult to verify the certificate. If the certificate itself is not in the Trust Store, the client can still check whether the issuer of the certificate is trusted. This is called the certificate chain. You can actually take a look at the trust store in your browser quite easily.[3]
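The idea of walking up a certificate chain can be sketched as follows. All certificate names, the `ISSUED_BY` table, and the `TRUST_STORE` set are hypothetical. A real client also verifies cryptographic signatures, validity dates, and the hostname, none of which is modelled here.

```python
# Hypothetical certificates: each name maps to the name of its issuer.
ISSUED_BY = {
    "www.example.com": "Example Intermediate CA",
    "Example Intermediate CA": "Example Root CA",
    "Example Root CA": "Example Root CA",  # root certificates are self-signed
}

# Hypothetical trust store: the certificates this client trusts out of the box.
TRUST_STORE = {"Example Root CA"}


def is_trusted(cert: str, max_depth: int = 10) -> bool:
    """Walk up the certificate chain until a trusted certificate is found."""
    for _ in range(max_depth):
        if cert in TRUST_STORE:
            return True
        issuer = ISSUED_BY.get(cert)
        if issuer is None or issuer == cert:
            return False  # unknown issuer, or an untrusted self-signed root
        cert = issuer
    return False


print(is_trusted("www.example.com"))      # chain ends in a trusted root
print(is_trusted("unknown.example.org"))  # no known issuer at all
```

The website's own certificate is not in the trust store here, yet it is accepted because its chain of issuers ends at a certificate that is.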
To me, the magic of HTTPS encryption is that a man in the middle cannot decrypt the communication even if it intercepts all information that the client and server have sent over in the handshake. I will briefly explain how this is achieved. This is where things start to get exciting.
The client and the server are going to need the same session key to encrypt and decrypt each other’s communication. However, they cannot just send this key over, as a man in the middle could simply eavesdrop on the key and decrypt the communication. Instead, they use something called a Diffie-Hellman (DH) key exchange.[4]
A core concept in DH is the one-way function: a function that is fast to compute but hard to invert. The simplest example of a one-way function involves modular arithmetic, which we will use in this example. Remember that the modulus of a number is what remains after division by another number. For example: $17 \bmod 5 = 2$.[5]
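A small Python sketch shows the asymmetry between the two directions. It uses modular exponentiation as the one-way function; the numbers 5, 6, and 23 and the function names are chosen purely for illustration. With numbers this small the brute-force inverse is instant, but its loop grows with the size of the modulus, whereas the built-in `pow` stays fast.

```python
def forward(g: int, x: int, p: int) -> int:
    """Easy direction: modular exponentiation, g**x mod p."""
    return pow(g, x, p)


def invert_by_brute_force(g: int, y: int, p: int) -> int:
    """Hard direction: try every exponent until g**x mod p equals y."""
    for x in range(1, p):
        if pow(g, x, p) == y:
            return x
    raise ValueError("no exponent found")


print(17 % 5)                           # the remainder after dividing 17 by 5
y = forward(5, 6, 23)                   # one fast call
print(y)
print(invert_by_brute_force(5, y, 23))  # recovers the exponent by trial and error
```

For the very large moduli used in practice, trying every exponent is no longer feasible, which is what makes the function "one-way".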
DH key exchange exploits two tricks of modular arithmetic. The first is the Chinese Remainder Theorem, which states that we can calculate $x$ in $x \equiv r \pmod{m}$ if we have a couple of $r$ and $m$ values. The second trick is that only two of these equations exist under the condition that $p$ is a prime and $g$ is a primitive root modulo $p$. When $p$ is a prime, only two of these equations exist since a prime only has two divisors.
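What it means to be a primitive root modulo a prime can be checked directly for small numbers. The sketch below assumes the usual definition: the powers of the base, taken modulo the prime, run through every value from 1 up to the prime minus one. The function name and the example values 5, 2, and 23 are chosen for illustration.

```python
def is_primitive_root(g: int, p: int) -> bool:
    """True if the powers of g modulo the prime p hit every value 1..p-1."""
    seen = {pow(g, k, p) for k in range(1, p)}
    return seen == set(range(1, p))


print(is_primitive_root(5, 23))  # 5 generates all 22 non-zero remainders
print(is_primitive_root(2, 23))  # 2 only generates 11 of them
```

This exhaustive check is only practical for tiny primes; it is meant to show the property, not to be used on real key sizes.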
The server and client work together in sharing their components of the equation to help the other construct their equation in three steps:
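Now, both the server and the client can compute the session key $s$. Writing $p$ for the prime, $g$ for the primitive root, $a$ and $b$ for the local secrets of the server and the client, and $A = g^a \bmod p$ and $B = g^b \bmod p$ for the values they sent to each other, the server now has $s = B^a \bmod p$, and the client $s = A^b \bmod p$. The session keys can now be used to encrypt and decrypt communication between client and server. Note that a man in the middle would have had to acquire either local secret $a$ or $b$ to access session key $s$ and decrypt the communication. All of this happens every time you make an HTTPS connection. Crazy, right? If you were not sure whether you wanted HTTPS on your site before, I am sure you want it now.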
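The whole exchange fits in a few lines of Python. This is a toy sketch under stated assumptions: the prime 23 and primitive root 5 are far too small to be secure, and the names `make_keypair`, `server_key`, and `client_key` are invented for this example. Real connections use primes of thousands of bits or elliptic curves.

```python
import secrets

p, g = 23, 5  # public values: a small prime and a primitive root, illustration only


def make_keypair() -> tuple[int, int]:
    """Pick a local secret and derive the public value that is sent over."""
    secret = secrets.randbelow(p - 2) + 1  # random integer in 1..p-2
    public = pow(g, secret, p)             # g**secret mod p, safe to share
    return secret, public


a, A = make_keypair()  # server keeps a, sends A
b, B = make_keypair()  # client keeps b, sends B

server_key = pow(B, a, p)  # server combines the client's public value with a
client_key = pow(A, b, p)  # client combines the server's public value with b
print(server_key == client_key)
```

Both sides end up with the same number because $(g^b)^a$ and $(g^a)^b$ are both equal to $g^{ab}$ modulo the prime, while an eavesdropper only ever sees the two public values.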
This part might come to you as a surprise. “If we have a domain, and AWS S3 can provide a storage location that serves static websites, why would we need CloudFront?” you might wonder. When distributing your data directly through S3, you would be left with two issues: First, if people visited your naked domain maartenpoirot.com, they would not be redirected to www.maartenpoirot.com, and the page would not be found. Second, the website endpoint of S3 does not support SSL connections.[6] However, the Representational State Transfer (REST) endpoints of S3 do. Thus, we need CloudFront to redirect and access the bucket through the REST API.
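The redirect for the naked domain boils down to a simple rule, sketched here in Python. This is not CloudFront configuration and not necessarily how this site implements it; the function `route` and its return format are hypothetical and only illustrate the intended behaviour.

```python
CANONICAL_HOST = "www.maartenpoirot.com"
NAKED_HOST = "maartenpoirot.com"


def route(host: str, path: str) -> tuple[int, dict[str, str]]:
    """Return a (status code, headers) pair for an incoming request."""
    if host == NAKED_HOST:
        # Permanent redirect to the same path on the www host, over HTTPS.
        return 301, {"Location": f"https://{CANONICAL_HOST}{path}"}
    return 200, {}  # already on the canonical host: serve the page


print(route("maartenpoirot.com", "/the-how"))
print(route("www.maartenpoirot.com", "/the-how"))
```

A visitor who types the naked domain is thus sent on to the www address instead of hitting a page that cannot be found.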
Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can store and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps. With cost-effective storage classes and easy-to-use management features, you can optimize costs, organize data, and configure fine-tuned access controls to meet specific business, organizational, and compliance requirements.