After implementing an AWS CloudFront distribution for serving content from AWS S3, it is best practice to prevent direct access to the S3 bucket. This prevents duplicate content issues on search engines and also means your content can only be reached through the domains you expect.
# Pre-requisites
This article builds on the blog described in Hexo, AWS and Serverless Framework and Securing A Test Environment Using AWS WAF; however, it should be easy to apply this logic to any project hosted on AWS and managed with Serverless Framework.
# Create an Origin Access Identity
The first part of removing direct S3 access is to create an identity for the AWS CloudFront distribution to use to identify itself to AWS S3. To do this, a CloudFrontOriginAccessIdentity resource needs to be created.

    CloudFrontIdentity:
      Type: AWS::CloudFront::CloudFrontOriginAccessIdentity
      Properties:
        CloudFrontOriginAccessIdentityConfig:
          Comment:
            Fn::Join:
              - " "
              - - ${self:custom.domain.domainname}
                - CloudFront
                - Identity
# Apply the Identity to the CloudFront Distribution
To apply the previous identity to the AWS CloudFront distribution, the distribution configuration in the resources.yml file needs to be updated. The setting is specific to an origin within the configuration and can only be applied to an S3 Origin.

    - DomainName:
        Fn::GetAtt:
          - WebsiteS3Bucket
          - DomainName
      Id: defaultOrigin
      S3OriginConfig:
        OriginAccessIdentity:
          Fn::Join:
            - "/"
            - - origin-access-identity
              - cloudfront
              - Ref: CloudFrontIdentity
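The Fn::Join in this origin configuration resolves to a single string of the form origin-access-identity/cloudfront/ followed by the identity's ID. As a rough illustration only (the helper name and the sample ID are made up for this sketch and are not part of the project), the same join in JavaScript looks like this:

```javascript
'use strict';

// Hypothetical helper: mirrors what the Fn::Join with "/" produces once
// CloudFormation has resolved `Ref: CloudFrontIdentity` to the identity's ID.
function originAccessIdentityPath(identityId) {
  return ['origin-access-identity', 'cloudfront', identityId].join('/');
}

// 'E2EXAMPLE1234' is a placeholder, not a real identity ID
console.log(originAccessIdentityPath('E2EXAMPLE1234'));
// prints "origin-access-identity/cloudfront/E2EXAMPLE1234"
```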
# Add Functionality to Display index.html Files By Default
As documented on AWS CloudFront Developer Guide - Specifying a Default Root Object:
However, if you define a default root object, an end-user request for a subdirectory of your distribution does not return the default root object.
[…]
The behavior of CloudFront default root objects is different from the behavior of Amazon S3 index documents. When you configure an Amazon S3 bucket as a website and specify the index document, Amazon S3 returns the index document even if a user requests a subdirectory in the bucket.
This means logic has to be added to display the index.html files Hexo generates in each directory.
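To make the difference concrete, here is a deliberately simplified model of the two behaviours quoted above. Both function names and the sample paths are assumptions made for this sketch; real CloudFront and S3 website endpoints do more than this (redirects, error documents and so on).

```javascript
'use strict';

// Simplified model of a CloudFront default root object:
// only a request for the distribution root is mapped to the index file.
function cloudFrontDefaultRootObject(uri, rootObject = 'index.html') {
  return uri === '/' ? '/' + rootObject : uri;
}

// Simplified model of an S3 website index document:
// any request ending in '/' is mapped to the index file in that directory.
function s3WebsiteIndexDocument(uri, indexDocument = 'index.html') {
  return uri.endsWith('/') ? uri + indexDocument : uri;
}

// Sample paths are hypothetical
for (const uri of ['/', '/about/', '/css/site.css']) {
  console.log(uri, '->', cloudFrontDefaultRootObject(uri), '|', s3WebsiteIndexDocument(uri));
}
```

The subdirectory case is the one that matters here: in this model '/about/' is left untouched by the default root object but resolved to '/about/index.html' by the S3 index document.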
The first step to implement this is to create an AWS Lambda function; then add the function to Serverless Framework’s configuration; and finally hook the function to the AWS CloudFront distribution.
Start by installing serverless-plugin-cloudfront-lambda-edge.

    npm i @silvermine/serverless-plugin-cloudfront-lambda-edge --save-dev

Add the plugin to the plugins list in the serverless.yml file.

    - '@silvermine/serverless-plugin-cloudfront-lambda-edge'
Then create a new file at functions/urlRewrite.js with the AWS Lambda function (in this case it’s JavaScript).

    'use strict';

    exports.handler = (event, context, callback) => {
        // Extract the request from the CloudFront event that is sent to Lambda@Edge
        var request = event.Records[0].cf.request;

        // Extract the URI from the request
        var olduri = request.uri;

        // Match any '/' that occurs at the end of a URI. Replace it with a default index
        var newuri = olduri.replace(/\/$/, '\/index.html');

        // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
        console.log("Old URI: " + olduri);
        console.log("New URI: " + newuri);

        // Replace the received URI with the URI that includes the index page
        request.uri = newuri;

        // Return to CloudFront
        return callback(null, request);
    };
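Before deploying, it can be useful to run the rewrite logic locally. The sketch below repeats the same replace() call so that it runs on its own, and feeds it a minimal stand-in for a CloudFront event. The fakeEvent and rewrite helpers and the sample paths are assumptions for illustration; a real event carries many more fields.

```javascript
'use strict';

// Same rewrite logic as functions/urlRewrite.js, repeated so this sketch runs on its own
const handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  request.uri = request.uri.replace(/\/$/, '/index.html');
  return callback(null, request);
};

// Minimal stand-in for a CloudFront origin-request event (only the field the handler reads)
function fakeEvent(uri) {
  return { Records: [{ cf: { request: { uri } } }] };
}

// Invoke the handler and return the rewritten URI (the callback fires synchronously)
function rewrite(uri) {
  let result;
  handler(fakeEvent(uri), {}, (err, request) => { result = request.uri; });
  return result;
}

console.log(rewrite('/'));              // "/index.html"
console.log(rewrite('/2019/01/post/')); // "/2019/01/post/index.html"
console.log(rewrite('/css/style.css')); // "/css/style.css"
console.log(rewrite('/about'));         // "/about" (no trailing slash, so unchanged)
```

Note the last case: a URI without a trailing slash is passed through untouched, so links to directories need to include the trailing slash.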
Serverless Framework now needs to know about the new function so it can coordinate the deployment for the code, configuration of AWS Lambda, and linking the function to the distribution. This is done by adding a new top level element to the serverless.yml file.

    # Define the Lambda functions for the site
    functions:
      # This function will be deployed to Lambda@Edge and rewrite URLs to include index.html
      urlrewrite:
        name: ${self:service}-${self:custom.stage}-cf-url-rewriter
        handler: functions/urlRewrite.handler
        memorySize: 128
        timeout: 1
        lambdaAtEdge:
          distribution: WebsiteCloudFrontDistribution
          eventType: origin-request
# Update the S3 Bucket Policy
Now that an identity for AWS CloudFront has been configured, the policy on the AWS S3 bucket can be restricted to only allow access via the CloudFront distribution.
In the resources.yml file there is a Statement for the S3 Bucket Policy. This needs to be adjusted to remove public access and grant access to CloudFront.

    - Sid: CloudFrontForGetBucketObjects
      Effect: Allow
      Principal:
        CanonicalUser:
          Fn::GetAtt:
            - CloudFrontIdentity
            - S3CanonicalUserId
      Action: 's3:GetObject'
      Resource:
        Fn::Join:
          - ''
          -
            - 'arn:aws:s3:::'
            - Ref: WebsiteS3Bucket
            - /*
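For reference, this is roughly the statement the bucket ends up with once CloudFormation has resolved the Fn::GetAtt and Fn::Join calls. The helper name, bucket name and canonical user ID below are placeholders invented for this sketch.

```javascript
'use strict';

// Hypothetical helper: builds the statement that the YAML resolves to
// for a given bucket name and CloudFront identity canonical user ID.
function cloudFrontReadStatement(bucketName, canonicalUserId) {
  return {
    Sid: 'CloudFrontForGetBucketObjects',
    Effect: 'Allow',
    Principal: { CanonicalUser: canonicalUserId },
    Action: 's3:GetObject',
    // Equivalent of joining 'arn:aws:s3:::', the bucket name and '/*'
    Resource: 'arn:aws:s3:::' + bucketName + '/*',
  };
}

// Both arguments are placeholders, not real values
const statement = cloudFrontReadStatement('example.com', 'EXAMPLE-CANONICAL-USER-ID');
console.log(JSON.stringify(statement, null, 2));
```

The only principal is the CloudFront identity and the only action is s3:GetObject, so anonymous requests made directly to the bucket are no longer covered by any Allow statement.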
# Configure File Uploads to be Private
Within the main serverless.yml file, under the custom.assets.targets element, configuration needs to be added to each target to ensure all files are uploaded as private.

    acl: private
# Update the S3 Bucket
All the requirements are in place for AWS CloudFront to be able to access the AWS S3 bucket contents. All that’s left to do is remove all public access from the bucket.
To do this, locate the bucket configuration in resources.yml, update the AccessControl to BucketOwnerFullControl and delete the WebsiteConfiguration.

    Properties:
      AccessControl: BucketOwnerFullControl
      BucketName: ${self:custom.domain.domainname}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
# Testing
Prior to these changes the website was accessible from:

- https://s3-us-west-2.amazonaws.com/<domain>/index.html
- http://<domain>.s3-website-us-west-2.amazonaws.com/
- https://<domain>.s3-us-west-2.amazonaws.com/index.html
- https://<domain>/index.html

Of these, only the last one should continue to work.
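The check can be scripted. The sketch below builds the four URLs for a domain and records which of them is expected to keep working; the optional checkAccess function then requests each one. It assumes Node.js 18 or later for the global fetch(), and 'example.com' and both function names are placeholders for illustration.

```javascript
'use strict';

// Build the four URLs listed above, with whether each one
// should still serve the site after the changes.
function expectedAccess(domain, region = 'us-west-2') {
  return [
    { url: `https://s3-${region}.amazonaws.com/${domain}/index.html`, shouldWork: false },
    { url: `http://${domain}.s3-website-${region}.amazonaws.com/`, shouldWork: false },
    { url: `https://${domain}.s3-${region}.amazonaws.com/index.html`, shouldWork: false },
    { url: `https://${domain}/index.html`, shouldWork: true },
  ];
}

// Optional live check; assumes Node.js 18+ for the global fetch().
async function checkAccess(domain) {
  for (const { url, shouldWork } of expectedAccess(domain)) {
    let ok = false;
    try {
      ok = (await fetch(url)).ok;
    } catch (err) {
      // Network or TLS failure: treat the URL as not accessible
    }
    console.log(`${ok === shouldWork ? 'PASS' : 'FAIL'} ${url}`);
  }
}

// 'example.com' is a placeholder; uncomment to run against a real deployment
// checkAccess('example.com');
console.log(expectedAccess('example.com').map((entry) => entry.url).join('\n'));
```

In a non-production stage the WAF configuration blocks requests by default, so the CloudFront URL would also need the custom Authorization header before it reports as working.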
# The Final Configuration Files
serverless.yml

    # The name of your project
    service: **project**

    # Plugins for additional Serverless functionality
    plugins:
      - serverless-s3-deploy
      - serverless-plugin-scripts
      - '@silvermine/serverless-plugin-cloudfront-lambda-edge'

    # Configuration for AWS
    provider:
      name: aws
      runtime: nodejs8.10
      profile: serverless
      # Some future functionality requires us to use us-east-1 at this time
      region: us-east-1

    custom:
      # This enables us to use the default stage definition, but override it from the command line
      stage: ${opt:stage, self:provider.stage}
      # This enables us to prepend the stage name for non-production environments
      domain:
        fulldomain:
          prod: ${self:custom.domain.domain}
          other: ${self:custom.stage}.${self:custom.domain.domain}
        # This value has been customised so I can maintain multiple demonstration sites
        domain: ${self:custom.postname}.${self:custom.domain.zonename}
        domainname: ${self:custom.domain.fulldomain.${self:custom.stage}, self:custom.domain.fulldomain.other}
        # DNS Zone name (this is only required so I can maintain multiple demonstration sites)
        zonename: alphageek.com.au
        cacheControlMaxAgeHTMLByStage:
          # HTML Cache time for production environment
          prod: 3600
          # HTML Cache time for other environments
          other: 0
        cacheControlMaxAgeHTML: ${self:custom.domain.cacheControlMaxAgeHTMLByStage.${self:custom.stage}, self:custom.domain.cacheControlMaxAgeHTMLByStage.other}
        sslCertificateARN: arn:aws:acm:us-east-1:165657443288:certificate/61d202ea-12f2-4282-b602-9c3b83183c7a
      assets:
        targets:
          # Configuration for HTML files (overriding the default cache control age)
          - bucket:
              Ref: WebsiteS3Bucket
            acl: private
            files:
              - source: ./public/
                headers:
                  CacheControl: max-age=${self:custom.domain.cacheControlMaxAgeHTML}
                empty: true
                globs:
                  - '**/*.html'
          # Configuration for all assets
          - bucket:
              Ref: WebsiteS3Bucket
            acl: private
            files:
              - source: ./public/
                empty: true
                globs:
                  - '**/*.js'
                  - '**/*.css'
                  - '**/*.jpg'
                  - '**/*.png'
                  - '**/*.gif'
      scripts:
        hooks:
          # Run these commands when creating the deployment artifacts
          package:createDeploymentArtifacts: >
            hexo clean &&
            hexo generate
          # Run these commands after infrastructure changes have been completed
          deploy:finalize: >
            sls s3deploy -s ${self:custom.stage}
      # AWS Region to S3 website hostname mapping
      s3DNSName:
        us-east-2: s3-website.us-east-2.amazonaws.com
        us-east-1: s3-website-us-east-1.amazonaws.com
        us-west-1: s3-website-us-west-1.amazonaws.com
        us-west-2: s3-website-us-west-2.amazonaws.com
        ap-south-1: s3-website.ap-south-1.amazonaws.com
        ap-northeast-3: s3-website.ap-northeast-3.amazonaws.com
        ap-northeast-2: s3-website.ap-northeast-2.amazonaws.com
        ap-southeast-1: s3-website-ap-southeast-1.amazonaws.com
        ap-southeast-2: s3-website-ap-southeast-2.amazonaws.com
        ap-northeast-1: s3-website-ap-northeast-1.amazonaws.com
        ca-central-1: s3-website.ca-central-1.amazonaws.com
        eu-central-1: s3-website.eu-central-1.amazonaws.com
        eu-west-1: s3-website-eu-west-1.amazonaws.com
        eu-west-2: s3-website.eu-west-2.amazonaws.com
        eu-west-3: s3-website.eu-west-3.amazonaws.com
        eu-north-1: s3-website.eu-north-1.amazonaws.com
        sa-east-1: s3-website-sa-east-1.amazonaws.com
      # Determine what resources file to include based on the current stage
      customConfigFile: ${self:custom.customConfigFiles.${self:custom.stage}, self:custom.customConfigFiles.other}
      customConfigFiles:
        prod: prod
        other: other

    # Define the Lambda functions for the site
    functions:
      # This function will be deployed to Lambda@Edge and rewrite URLs to include index.html
      urlrewrite:
        name: ${self:service}-${self:custom.stage}-cf-url-rewriter
        handler: functions/urlRewrite.handler
        memorySize: 128
        timeout: 1
        lambdaAtEdge:
          distribution: WebsiteCloudFrontDistribution
          eventType: origin-request

    # Define the resources we will need to host the site
    resources:
      # Include the resources file
      - ${file(config/resources.yml)}
      # Include the outputs file
      - ${file(config/outputs.yml)}
      # Include a custom configuration file based on the environment
      - ${file(config/resources/environment/${self:custom.customConfigFile}.yml)}

config/resources.yml

    Resources:
      # Set-up an S3 bucket to store the site
      WebsiteS3Bucket:
        Type: AWS::S3::Bucket
        Properties:
          AccessControl: BucketOwnerFullControl
          BucketName: ${self:custom.domain.domainname}
          PublicAccessBlockConfiguration:
            BlockPublicAcls: true
            BlockPublicPolicy: true
            IgnorePublicAcls: true
            RestrictPublicBuckets: true
      # Set-up a policy on the bucket so it can be used as a website
      WebsiteBucketPolicy:
        Type: AWS::S3::BucketPolicy
        Properties:
          PolicyDocument:
            Id:
              Fn::Join:
                - ""
                - - ${self:service.name}
                  - BucketPolicy
            Statement:
              - Sid: CloudFrontForGetBucketObjects
                Effect: Allow
                Principal:
                  CanonicalUser:
                    Fn::GetAtt:
                      - CloudFrontIdentity
                      - S3CanonicalUserId
                Action: 's3:GetObject'
                Resource:
                  Fn::Join:
                    - ''
                    -
                      - 'arn:aws:s3:::'
                      - Ref: WebsiteS3Bucket
                      - /*
          Bucket:
            Ref: WebsiteS3Bucket
      # Configure CloudFront to get all content from S3
      WebsiteCloudFrontDistribution:
        Type: 'AWS::CloudFront::Distribution'
        Properties:
          DistributionConfig:
            WebACLId:
              Ref: CustomAuthorizationHeaderRestriction
            Aliases:
              - ${self:custom.domain.domainname}
              - www.${self:custom.domain.domainname}
            CustomErrorResponses:
              - ErrorCode: '404'
                ResponsePagePath: "/error.html"
                ResponseCode: '200'
                ErrorCachingMinTTL: '30'
            DefaultCacheBehavior:
              Compress: true
              ForwardedValues:
                QueryString: false
                Cookies:
                  Forward: all
              SmoothStreaming: false
              TargetOriginId: defaultOrigin
              ViewerProtocolPolicy: redirect-to-https
            DefaultRootObject: index.html
            Enabled: true
            Origins:
              - DomainName:
                  Fn::GetAtt:
                    - WebsiteS3Bucket
                    - DomainName
                Id: defaultOrigin
                S3OriginConfig:
                  OriginAccessIdentity:
                    Fn::Join:
                      - "/"
                      - - origin-access-identity
                        - cloudfront
                        - Ref: CloudFrontIdentity
            PriceClass: PriceClass_All
            ViewerCertificate:
              AcmCertificateArn: ${self:custom.domain.sslCertificateARN}
              SslSupportMethod: sni-only
      # DNS Record for the domain
      WebsiteDNSRecord:
        Type: "AWS::Route53::RecordSet"
        Properties:
          AliasTarget:
            DNSName:
              Fn::GetAtt:
                - WebsiteCloudFrontDistribution
                - DomainName
            HostedZoneId: Z2FDTNDATAQYW2
          HostedZoneName: ${self:custom.domain.domain}.
          Name: ${self:custom.domain.domainname}
          Type: 'A'
      # DNS Record for www.domain
      WebsiteWWWDNSRecord:
        Type: "AWS::Route53::RecordSet"
        Properties:
          AliasTarget:
            DNSName:
              Fn::GetAtt:
                - WebsiteCloudFrontDistribution
                - DomainName
            HostedZoneId: Z2FDTNDATAQYW2
          HostedZoneName: ${self:custom.domain.domain}.
          Name: www.${self:custom.domain.domainname}
          Type: 'A'
      # Predicate to match the authorization header
      CustomAuthorizationHeader:
        Type: AWS::WAF::ByteMatchSet
        Properties:
          ByteMatchTuples:
            -
              FieldToMatch:
                Type: HEADER
                Data: Authorization
              TargetString:
                Fn::Join:
                  - " "
                  - - Custom
                    - "**Password**"
              TextTransformation: NONE
              PositionalConstraint: EXACTLY
          Name:
            Fn::Join:
              - "_"
              - - ${self:custom.domain.domainname}
                - Authorization
                - Header
      CustomAuthorizationHeaderRule:
        Type: AWS::WAF::Rule
        Properties:
          Name:
            Fn::Join:
              - "_"
              - - ${self:custom.domain.domainname}
                - Authorization
                - Header
                - Rule
          MetricName:
            Fn::Join:
              - ""
              - - ${self:custom.stage}
                - ${self:service.name}
                - Authorization
                - Header
                - Rule
          Predicates:
            -
              DataId:
                Ref: CustomAuthorizationHeader
              Negated: false
              Type: ByteMatch

config/outputs.yml

    Outputs:
      WebsiteURL:
        Value:
          Fn::GetAtt:
            - WebsiteS3Bucket
            - WebsiteURL
        Description: URL for my website hosted on S3
      S3BucketSecureURL:
        Value:
          Fn::Join:
            - ''
            -
              - 'https://'
              - Fn::GetAtt:
                  - WebsiteS3Bucket
                  - DomainName
        Description: Secure URL of S3 bucket to hold website content

config/resources/environment/prod.yml

    Resources:
      # Allow the custom authorisation header in the production environment
      CustomAuthorizationHeaderRestriction:
        Type: AWS::WAF::WebACL
        Properties:
          DefaultAction:
            Type: ALLOW
          Name:
            Fn::Join:
              - "_"
              - - ${self:custom.domain.domainname}
                - Authorization
                - Header
                - Restriction
          MetricName:
            Fn::Join:
              - ""
              - - ${self:custom.stage}
                - ${self:service.name}
                - Authorization
                - Header
                - Restriction
          Rules:
            -
              Action:
                Type: ALLOW
              Priority: 1
              RuleId:
                Ref: CustomAuthorizationHeaderRule

config/resources/environment/other.yml

    Resources:
      # Require the custom authorisation header with the correct password in non-production environment
      CustomAuthorizationHeaderRestriction:
        Type: AWS::WAF::WebACL
        Properties:
          DefaultAction:
            Type: BLOCK
          Name:
            Fn::Join:
              - "_"
              - - ${self:custom.domain.domainname}
                - Authorization
                - Header
                - Restriction
          MetricName:
            Fn::Join:
              - ""
              - - ${self:custom.stage}
                - ${self:service.name}
                - Authorization
                - Header
                - Restriction
          Rules:
            -
              Action:
                Type: ALLOW
              Priority: 1
              RuleId:
                Ref: CustomAuthorizationHeaderRule
# Example Site
- Demonstration website
- S3 Bucket Website [No longer working]
- Code Repository