
Best practices for S3 web hosting and explaining why

February 3, 2022
Source code: GitHub

There are a lot of very good resources explaining how to set up an S3 website, but few explain why you should choose one option over another.

In this article I will not explain step by step how to set up an S3 website; if you are looking for that, I have added some links at the bottom of this article. Instead, I will list the best practices first and then support each of them with an explanation further down.

Best practices for S3 website hosting

In short, the best practices for hosting your S3 website with SSL are:

Use a public bucket instead of a private bucket

Use a public bucket and only allow requests with a secret Referer header, preventing direct access to your S3 bucket.

Use CloudFront instead of Route 53

By using CloudFront you gain SSL and caching, and you speed up your website. Make sure you also set up a second CloudFront distribution for any redirect S3 buckets (a sketch of such a redirect bucket follows after this list).

Use the website hosting endpoint instead of the S3 REST API endpoint

In CloudFront, make sure you set up your origin domain to route to the S3 website hosting endpoint. Do not use the S3 REST API endpoint, because you lose the web hosting optimizations.

  • Example of the S3 website endpoint: example.com.s3-website.eu-west-1.amazonaws.com
  • Example of the REST API endpoint: example.com.s3.eu-west-1.amazonaws.com
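
As a concrete illustration of the redirect bucket mentioned above, here is a minimal Terraform sketch of a www bucket whose only job is to redirect to the apex domain. The bucket and resource names are examples, not part of my actual setup; the second CloudFront distribution would then use this bucket's website endpoint as its origin.

# Minimal sketch: a "www" bucket that only redirects to the apex domain.
# Bucket and resource names are examples; adjust to your own domain.
resource "aws_s3_bucket" "redirect" {
  bucket = "www.example.com"
}

resource "aws_s3_bucket_website_configuration" "redirect" {
  bucket = aws_s3_bucket.redirect.id

  # Every request to this bucket's website endpoint is redirected to example.com.
  redirect_all_requests_to {
    host_name = "example.com"
    protocol  = "https"
  }
}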

Explaining the best practices

Below you will find explanations of why I think the practices mentioned above are best practices. :)

Private bucket vs Public bucket

I wanted to have a private website bucket and only have CloudFront access it. This is possible by using an origin access identity (OAI), read more here, but it has a downside; more on that later.

I started thinking about whether the bucket should be private. My conclusion is that a public bucket is perfectly fine, because your static website files are already accessible through CloudFront, and you should not have any sensitive files in that bucket anyway. So you won't gain a lot by making the bucket private.

Still, I wanted to prevent direct requests to my S3 bucket, so I found a solution for this, which I will describe below.

The solution is to add an S3 bucket policy that only allows s3:GetObject requests when a secret Referer header is set. This header is also added to the CloudFront distribution as a custom origin header. This way CloudFront always adds the Referer header to its requests and is allowed to get the objects from the bucket.

Example of the S3 bucket policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::example.com/*",
                "arn:aws:s3:::example.com"
            ],
            "Condition": {
                "StringLike": {
                    "aws:Referer": "ZJAibno2C7NjHqBmqh9Q"
                }
            }
        }
    ]
}
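
To show the CloudFront side of this setup, here is a minimal Terraform sketch of a distribution that uses the S3 website endpoint as a custom origin and sends the secret Referer header with every origin request. Resource names and the header value are examples, and a real setup would attach an ACM certificate and your own domain aliases instead of the default CloudFront certificate.

# Minimal sketch of the CloudFront distribution: website endpoint as a custom
# origin, plus the secret Referer header that matches the bucket policy above.
resource "aws_cloudfront_distribution" "website" {
  enabled             = true
  default_root_object = "index.html"

  origin {
    origin_id   = "s3-website"
    domain_name = "example.com.s3-website.eu-west-1.amazonaws.com"

    # The S3 website endpoint only speaks plain HTTP, so it is configured
    # as a custom origin, not as an S3 origin.
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "http-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }

    # Must match the aws:Referer condition in the bucket policy.
    custom_header {
      name  = "Referer"
      value = "ZJAibno2C7NjHqBmqh9Q"
    }
  }

  default_cache_behavior {
    target_origin_id       = "s3-website"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # For brevity this uses the default CloudFront certificate; for your own
  # domain you would add aliases and an ACM certificate here.
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}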

Downside of using OAI with CloudFront for your website bucket

I would like to start with this: using a private bucket with CloudFront by setting up an OAI is preferred if you don't use the S3 website hosting feature. In some cases, such as hosting a Single Page Application (SPA), it is possible to use a private bucket, which means some SPAs don't experience the downside described below.

The downside below is only applicable to private website buckets where you access the S3 REST API from CloudFront.

The Amazon S3 website endpoint is optimized for access from a web browser. It is a public endpoint, meaning that if your bucket is private, CloudFront is no longer able to access the objects through the website endpoint. This is an example of the S3 website endpoint: example.com.s3-website.eu-west-1.amazonaws.com.

This means you have to access your S3 website through the REST API endpoint, for example: example.com.s3.eu-west-1.amazonaws.com. To access the private bucket through CloudFront you have to set up an origin access identity (OAI), and you will lose the S3 website hosting functionality, meaning your objects get served without the web hosting optimizations.
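
If you do go the private-bucket route (for example for an SPA that doesn't need the website hosting features), this is roughly what the OAI setup looks like in Terraform. The resource and bucket names are just examples for illustration.

# Minimal sketch of the OAI setup for a private bucket served via the REST endpoint.
resource "aws_cloudfront_origin_access_identity" "website" {
  comment = "Access identity for the example.com bucket"
}

# Bucket policy that only lets the OAI read objects; no public access needed.
data "aws_iam_policy_document" "oai_read" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::example.com/*"]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.website.iam_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "oai_read" {
  bucket = "example.com"
  policy = data.aws_iam_policy_document.oai_read.json
}

# In the CloudFront origin you would then use the REST endpoint
# (example.com.s3.eu-west-1.amazonaws.com) with an s3_origin_config block
# referencing the OAI's cloudfront_access_identity_path.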

This results in the need to add .html extensions to your URLs, because you are accessing the objects directly and the S3 website hosting feature is no longer there to resolve URLs without the .html extension. URLs with the .html extension don't look very nice, in my opinion.

Now I hear you think: ‘but you can solve this by setting the default root object in your CloudFront distribution’. However, this only works for the root object. Read more about it here: How headers work with default root objects. If you request a subdirectory, you need to add the .html again, which again doesn't look very nice.

Another option would be to remove the .html extension from the objects in the bucket and set the Content-Type (MIME type) metadata explicitly, but that doesn't sound like a very elegant solution to me.
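
For completeness, this is roughly what that less elegant option would look like in Terraform: upload the object under a key without the .html extension and set the Content-Type yourself. The file paths and names here are made up for the example.

# Sketch of the "no .html extension" workaround: the object key has no
# extension, so the MIME type has to be set explicitly or the browser
# will treat the response as a download.
resource "aws_s3_object" "about_page" {
  bucket       = "example.com"
  key          = "about"                            # served as /about
  source       = "${path.module}/site/about.html"   # example local file
  content_type = "text/html"
}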

Route 53 vs CloudFront for S3 website hosting

You can set up your S3 website bucket to be directly accessed with the use of a DNS record in Route 53. Or you could place a CloudFront distribution in front of the S3 bucket. Below I will explain why I think you should use CloudFront.

I think CloudFront is the best solution because of the following advantages it brings:

  • Serving your website over SSL (HTTPS)
  • Requests will be cached
  • Increased website performance
  • The ability to set common security headers

When using Route 53 without CloudFront you won't have SSL or caching. An additional potential downside is that your bucket name and domain name must be identical. Read more here about why your domain and bucket name must be identical.
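
For reference, pointing your domain at the CloudFront distribution instead of directly at the bucket looks roughly like this in Terraform. The hosted zone lookup and the distribution name are assumptions carried over from the earlier sketch.

# Sketch: alias record that points the apex domain at the CloudFront
# distribution instead of at the S3 website endpoint directly.
data "aws_route53_zone" "main" {
  name = "example.com."
}

resource "aws_route53_record" "website" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "example.com"
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.website.domain_name
    zone_id                = aws_cloudfront_distribution.website.hosted_zone_id
    evaluate_target_health = false
  }
}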

Useful links

Bonus: Terraform template

I have created a Terraform template using these best practices, which can be found here: https://github.com/tiborhercz/s3-website-cloudfront-terraform