Securing S3 Bucket From Direct Access

After implementing an AWS CloudFront distribution to serve content from AWS S3, it is best practice to prevent direct access to the S3 bucket. This avoids duplicate content issues on search engines and means your content can only be reached through the domains you expect.

# Pre-requisites

This article is based on the blog described in Hexo, AWS and Serverless Framework and Securing A Test Environment Using AWS WAF; however, it should be easy to apply this logic to any project hosted on AWS and managed with Serverless Framework.

# Create an Origin Access Identity

The first part of removing direct S3 access is to create an identity that the AWS CloudFront distribution uses to identify itself to AWS S3. To do this, a CloudFrontOriginAccessIdentity resource needs to be created.

  CloudFrontIdentity:
    Type: AWS::CloudFront::CloudFrontOriginAccessIdentity
    Properties:
      CloudFrontOriginAccessIdentityConfig:
        Comment:
          Fn::Join:
            - " "
            - - ${self:custom.domain.domainname}
              - CloudFront
              - Identity

# Apply the Identity to the CloudFront Distribution

To apply this identity to the AWS CloudFront distribution, update the distribution configuration in the resources.yml file. The setting is specific to an origin within the configuration and can only be applied to an S3 origin.

          - DomainName:
              Fn::GetAtt:
                - WebsiteS3Bucket
                - DomainName
            Id: defaultOrigin
            S3OriginConfig:
              OriginAccessIdentity:
                Fn::Join:
                  - "/"
                  - - origin-access-identity
                    - cloudfront
                    - Ref: CloudFrontIdentity
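
To make the Fn::Join above concrete, the short sketch below mimics what CloudFormation does with it. The `fnJoin` helper and the ID `E2EXAMPLE1234` are made up for illustration; in a real deployment `Ref: CloudFrontIdentity` resolves to the ID CloudFront assigns to the identity.

```javascript
'use strict';

// Hypothetical stand-in for CloudFormation's Fn::Join: glue the parts together with a delimiter.
const fnJoin = (delimiter, parts) => parts.join(delimiter);

// Placeholder ID; the real value comes from `Ref: CloudFrontIdentity` at deploy time.
const identityId = 'E2EXAMPLE1234';

const originAccessIdentity = fnJoin('/', ['origin-access-identity', 'cloudfront', identityId]);

console.log(originAccessIdentity); // origin-access-identity/cloudfront/E2EXAMPLE1234
```

The resulting string is the format the S3OriginConfig setting expects: a fixed prefix followed by the identity's ID.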

# Add Functionality to Display index.html Files By Default

As documented in the AWS CloudFront Developer Guide - Specifying a Default Root Object:

However, if you define a default root object, an end-user request for a subdirectory of your distribution does not return the default root object.

[…]

The behavior of CloudFront default root objects is different from the behavior of Amazon S3 index documents. When you configure an Amazon S3 bucket as a website and specify the index document, Amazon S3 returns the index document even if a user requests a subdirectory in the bucket.

This means logic has to be added to serve the index.html files Hexo generates in each directory. There are three steps: create an AWS Lambda function, add the function to Serverless Framework’s configuration, and finally hook the function to the AWS CloudFront distribution.

Start by installing serverless-plugin-cloudfront-lambda-edge.

npm i @silvermine/serverless-plugin-cloudfront-lambda-edge --save-dev

Add the plugin to the plugins list in the serverless.yml file.

  - '@silvermine/serverless-plugin-cloudfront-lambda-edge'

Then create a new file at functions/urlRewrite.js with the AWS Lambda function (in this case it’s JavaScript).

'use strict';
exports.handler = (event, context, callback) => {

    // Extract the request from the CloudFront event that is sent to Lambda@Edge
    var request = event.Records[0].cf.request;

    // Extract the URI from the request
    var olduri = request.uri;

    // Match any '/' that occurs at the end of a URI. Replace it with a default index
    var newuri = olduri.replace(/\/$/, '/index.html');

    // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
    console.log("Old URI: " + olduri);
    console.log("New URI: " + newuri);

    // Replace the received URI with the URI that includes the index page
    request.uri = newuri;

    // Return to CloudFront
    return callback(null, request);
};
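
One way to sanity-check the rewrite before deploying is to call the handler locally with a hand-built event. The sketch below is only an assumption about how you might test it: the handler is re-declared inline so the snippet runs on its own (in a real project you would load functions/urlRewrite.js instead), and `fakeEvent` and `rewrite` are hypothetical helpers that build only the part of the CloudFront event the function reads.

```javascript
'use strict';

// Same rewrite logic as functions/urlRewrite.js, repeated so this runs standalone.
const handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    request.uri = request.uri.replace(/\/$/, '/index.html');
    return callback(null, request);
};

// Hypothetical helper: builds the minimal event shape the handler reads.
const fakeEvent = (uri) => ({ Records: [{ cf: { request: { uri } } }] });

// Run the handler (it calls back synchronously) and return the rewritten URI.
const rewrite = (uri) => {
    let result;
    handler(fakeEvent(uri), {}, (err, request) => { result = request.uri; });
    return result;
};

console.log(rewrite('/'));              // /index.html
console.log(rewrite('/about/'));        // /about/index.html
console.log(rewrite('/css/style.css')); // /css/style.css
console.log(rewrite('/about'));         // /about
```

Note the last case: a URI without a trailing slash is passed through unchanged, so this function only helps for directory-style links that end in '/'.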

Serverless Framework now needs to know about the new function so it can coordinate deploying the code, configuring AWS Lambda, and linking the function to the distribution. This is done by adding a new top-level element to the serverless.yml file.

# Define the Lambda functions for the site
functions:
  # This function will be deployed to Lambda@Edge and rewrite URLs to include index.html
  urlrewrite:
    name: ${self:service}-${self:custom.stage}-cf-url-rewriter
    handler: functions/urlRewrite.handler
    memorySize: 128
    timeout: 1
    lambdaAtEdge:
      distribution: WebsiteCloudFrontDistribution
      eventType: origin-request

# Update the S3 Bucket Policy

Now that an identity for AWS CloudFront has been configured, the policy on the AWS S3 bucket can be restricted to only allow access via the CloudFront distribution.

In the resources.yml file there is a Statement for the S3 Bucket Policy. This needs to be adjusted to remove public access and grant access to CloudFront.

          - Sid: CloudFrontForGetBucketObjects
            Effect: Allow
            Principal:
              CanonicalUser:
                Fn::GetAtt:
                  - CloudFrontIdentity
                  - S3CanonicalUserId
            Action: 's3:GetObject'
            Resource:
              Fn::Join:
                - ''
                -
                  - 'arn:aws:s3:::'
                  - Ref: WebsiteS3Bucket
                  - /*
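
For reference, once CloudFormation resolves the intrinsic functions, the statement should come out roughly like the object printed below. This is only an illustration: `buildStatement` is a hypothetical helper, and the bucket name and canonical user ID are placeholder values standing in for what `Ref: WebsiteS3Bucket` and the `S3CanonicalUserId` attribute return at deploy time.

```javascript
'use strict';

// Build the statement as it should look once Fn::GetAtt, Ref and Fn::Join are resolved.
// Both arguments are placeholders for values CloudFormation supplies at deploy time.
const buildStatement = (bucketName, canonicalUserId) => ({
    Sid: 'CloudFrontForGetBucketObjects',
    Effect: 'Allow',
    Principal: { CanonicalUser: canonicalUserId },
    Action: 's3:GetObject',
    Resource: ['arn:aws:s3:::', bucketName, '/*'].join(''),
});

const statement = buildStatement('blog.example.com', 'EXAMPLE-CANONICAL-USER-ID');
console.log(JSON.stringify(statement, null, 2));
```

The Resource ends in /*, so the grant covers objects in the bucket rather than the bucket itself. Because only s3:GetObject is granted, S3 typically answers requests for missing keys with 403 rather than 404, which is worth keeping in mind when reading the CustomErrorResponses setting on the distribution.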

# Configure File Uploads to be Private

Within the main serverless.yml file, under the custom.assets.targets element, configuration needs to be added to each target to ensure all files are uploaded as private.

        acl: private
        acl: private

# Update the S3 Bucket

All the requirements are in place for AWS CloudFront to be able to access the AWS S3 bucket contents. All that’s left to do is remove all public access from the bucket.

To do this, locate the bucket configuration in resources.yml, update the AccessControl to BucketOwnerFullControl, and delete the WebsiteConfiguration.

    Properties:
      AccessControl: BucketOwnerFullControl
      BucketName: ${self:custom.domain.domainname}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true

# Testing

Prior to these changes the website was accessible from:

  • https://s3-us-west-2.amazonaws.com/<domain>/index.html
  • http://<domain>.s3-website-us-west-2.amazonaws.com/
  • https://<domain>.s3-us-west-2.amazonaws.com/index.html
  • https://<domain>/index.html

Of these only the last one should continue to work.

# The Final Configuration Files

serverless.yml:

# The name of your project
service: **project**

# Plugins for additional Serverless functionality
plugins:
  - serverless-s3-deploy
  - serverless-plugin-scripts
  - '@silvermine/serverless-plugin-cloudfront-lambda-edge'

# Configuration for AWS
provider:
  name: aws
  runtime: nodejs8.10
  profile: serverless
  # Some future functionality requires us to use us-east-1 at this time
  region: us-east-1

custom:
  # This enables us to use the default stage definition, but override it from the command line
  stage: ${opt:stage, self:provider.stage}
  # This enables us to prepend the stage name for non-production environments
  domain:
    fulldomain:
      prod: ${self:custom.domain.domain}
      other: ${self:custom.stage}.${self:custom.domain.domain}
    # This value has been customised so I can maintain multiple demonstration sites
    domain: ${self:custom.postname}.${self:custom.domain.zonename}
    domainname: ${self:custom.domain.fulldomain.${self:custom.stage}, self:custom.domain.fulldomain.other}
    # DNS Zone name (this is only required so I can maintain multiple demonstration sites)
    zonename: alphageek.com.au
    cacheControlMaxAgeHTMLByStage:
      # HTML Cache time for production environment
      prod: 3600
      # HTML Cache time for other environments
      other: 0
    cacheControlMaxAgeHTML: ${self:custom.domain.cacheControlMaxAgeHTMLByStage.${self:custom.stage}, self:custom.domain.cacheControlMaxAgeHTMLByStage.other}
    sslCertificateARN: arn:aws:acm:us-east-1:165657443288:certificate/61d202ea-12f2-4282-b602-9c3b83183c7a
  assets:
    targets:
      # Configuration for HTML files (overriding the default cache control age)
      - bucket:
          Ref: WebsiteS3Bucket
        acl: private
        files:
          - source: ./public/
            headers:
              CacheControl: max-age=${self:custom.domain.cacheControlMaxAgeHTML}
            empty: true
            globs:
              - '**/*.html'
      # Configuration for all assets
      - bucket:
          Ref: WebsiteS3Bucket
        acl: private
        files:
          - source: ./public/
            empty: true
            globs:
              - '**/*.js'
              - '**/*.css'
              - '**/*.jpg'
              - '**/*.png'
              - '**/*.gif'
  scripts:
    hooks:
      # Run these commands when creating the deployment artifacts
      package:createDeploymentArtifacts: >
        hexo clean &&
        hexo generate
      # Run these commands after infrastructure changes have been completed
      deploy:finalize: >
        sls s3deploy -s ${self:custom.stage}
  # AWS Region to S3 website hostname mapping
  s3DNSName:
    us-east-2: s3-website.us-east-2.amazonaws.com
    us-east-1: s3-website-us-east-1.amazonaws.com
    us-west-1: s3-website-us-west-1.amazonaws.com
    us-west-2: s3-website-us-west-2.amazonaws.com
    ap-south-1: s3-website.ap-south-1.amazonaws.com
    ap-northeast-3: s3-website.ap-northeast-3.amazonaws.com
    ap-northeast-2: s3-website.ap-northeast-2.amazonaws.com
    ap-southeast-1: s3-website-ap-southeast-1.amazonaws.com
    ap-southeast-2: s3-website-ap-southeast-2.amazonaws.com
    ap-northeast-1: s3-website-ap-northeast-1.amazonaws.com
    ca-central-1: s3-website.ca-central-1.amazonaws.com
    eu-central-1: s3-website.eu-central-1.amazonaws.com
    eu-west-1: s3-website-eu-west-1.amazonaws.com
    eu-west-2: s3-website.eu-west-2.amazonaws.com
    eu-west-3: s3-website.eu-west-3.amazonaws.com
    eu-north-1: s3-website.eu-north-1.amazonaws.com
    sa-east-1: s3-website-sa-east-1.amazonaws.com
  # Determine what resources file to include based on the current stage
  customConfigFile: ${self:custom.customConfigFiles.${self:custom.stage}, self:custom.customConfigFiles.other}
  customConfigFiles:
    prod: prod
    other: other

# Define the Lambda functions for the site
functions:
  # This function will be deployed to Lambda@Edge and rewrite URLs to include index.html
  urlrewrite:
    name: ${self:service}-${self:custom.stage}-cf-url-rewriter
    handler: functions/urlRewrite.handler
    memorySize: 128
    timeout: 1
    lambdaAtEdge:
      distribution: WebsiteCloudFrontDistribution
      eventType: origin-request

# Define the resources we will need to host the site
resources:
  # Include the resources file
  - ${file(config/resources.yml)}
  # Include the outputs file
  - ${file(config/outputs.yml)}
  # Include a custom configuration file based on the environment
  - ${file(config/resources/environment/${self:custom.customConfigFile}.yml)}

config/resources.yml:

Resources:
  # Set-up an S3 bucket to store the site
  WebsiteS3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: BucketOwnerFullControl
      BucketName: ${self:custom.domain.domainname}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # Identity the CloudFront distribution uses to identify itself to S3
  CloudFrontIdentity:
    Type: AWS::CloudFront::CloudFrontOriginAccessIdentity
    Properties:
      CloudFrontOriginAccessIdentityConfig:
        Comment:
          Fn::Join:
            - " "
            - - ${self:custom.domain.domainname}
              - CloudFront
              - Identity
  # Set up a policy on the bucket so only the CloudFront identity can read objects
  WebsiteBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      PolicyDocument:
        Id:
          Fn::Join:
            - ""
            - - ${self:service.name}
              - BucketPolicy
        Statement:
          - Sid: CloudFrontForGetBucketObjects
            Effect: Allow
            Principal:
              CanonicalUser:
                Fn::GetAtt:
                  - CloudFrontIdentity
                  - S3CanonicalUserId
            Action: 's3:GetObject'
            Resource:
              Fn::Join:
                - ''
                -
                  - 'arn:aws:s3:::'
                  - Ref: WebsiteS3Bucket
                  - /*
      Bucket:
        Ref: WebsiteS3Bucket
  # Configure CloudFront to get all content from S3
  WebsiteCloudFrontDistribution:
    Type: 'AWS::CloudFront::Distribution'
    Properties:
      DistributionConfig:
        WebACLId:
          Ref: CustomAuthorizationHeaderRestriction
        Aliases:
          - ${self:custom.domain.domainname}
          - www.${self:custom.domain.domainname}
        CustomErrorResponses:
          - ErrorCode: '404'
            ResponsePagePath: "/error.html"
            ResponseCode: '200'
            ErrorCachingMinTTL: '30'
        DefaultCacheBehavior:
          Compress: true
          ForwardedValues:
            QueryString: false
            Cookies:
              Forward: all
          SmoothStreaming: false
          TargetOriginId: defaultOrigin
          ViewerProtocolPolicy: redirect-to-https
        DefaultRootObject: index.html
        Enabled: true
        Origins:
          - DomainName:
              Fn::GetAtt:
                - WebsiteS3Bucket
                - DomainName
            Id: defaultOrigin
            S3OriginConfig:
              OriginAccessIdentity:
                Fn::Join:
                  - "/"
                  - - origin-access-identity
                    - cloudfront
                    - Ref: CloudFrontIdentity
        PriceClass: PriceClass_All
        ViewerCertificate:
          AcmCertificateArn: ${self:custom.domain.sslCertificateARN}
          SslSupportMethod: sni-only
  # DNS Record for the domain
  WebsiteDNSRecord:
    Type: "AWS::Route53::RecordSet"
    Properties:
      AliasTarget:
        DNSName:
          Fn::GetAtt:
            - WebsiteCloudFrontDistribution
            - DomainName
        HostedZoneId: Z2FDTNDATAQYW2
      HostedZoneName: ${self:custom.domain.domain}.
      Name: ${self:custom.domain.domainname}
      Type: 'A'
  # DNS Record for www.domain
  WebsiteWWWDNSRecord:
    Type: "AWS::Route53::RecordSet"
    Properties:
      AliasTarget:
        DNSName:
          Fn::GetAtt:
            - WebsiteCloudFrontDistribution
            - DomainName
        HostedZoneId: Z2FDTNDATAQYW2
      HostedZoneName: ${self:custom.domain.domain}.
      Name: www.${self:custom.domain.domainname}
      Type: 'A'
  # Predicate to match the authorization header
  CustomAuthorizationHeader:
    Type: AWS::WAF::ByteMatchSet
    Properties:
      ByteMatchTuples:
        -
          FieldToMatch:
            Type: HEADER
            Data: Authorization
          TargetString:
            Fn::Join:
              - " "
              - - Custom
                - "**Password**"
          TextTransformation: NONE
          PositionalConstraint: EXACTLY
      Name:
        Fn::Join:
          - "_"
          - - ${self:custom.domain.domainname}
            - Authorization
            - Header
  CustomAuthorizationHeaderRule:
    Type: AWS::WAF::Rule
    Properties:
      Name:
        Fn::Join:
          - "_"
          - - ${self:custom.domain.domainname}
            - Authorization
            - Header
            - Rule
      MetricName:
        Fn::Join:
          - ""
          - - ${self:custom.stage}
            - ${self:service.name}
            - Authorization
            - Header
            - Rule
      Predicates:
        -
          DataId:
            Ref: CustomAuthorizationHeader
          Negated: false
          Type: ByteMatch

config/outputs.yml:

Outputs:
  WebsiteURL:
    Value:
      Fn::GetAtt:
        - WebsiteS3Bucket
        - WebsiteURL
    Description: URL for my website hosted on S3
  S3BucketSecureURL:
    Value:
      Fn::Join:
        - ''
        -
          - 'https://'
          - Fn::GetAtt:
              - WebsiteS3Bucket
              - DomainName
    Description: Secure URL of S3 bucket to hold website content

config/resources/environment/prod.yml:

Resources:
  # Allow the custom authorisation header in the production environment
  CustomAuthorizationHeaderRestriction:
    Type: AWS::WAF::WebACL
    Properties:
      DefaultAction:
        Type: ALLOW
      Name:
        Fn::Join:
          - "_"
          - - ${self:custom.domain.domainname}
            - Authorization
            - Header
            - Restriction
      MetricName:
        Fn::Join:
          - ""
          - - ${self:custom.stage}
            - ${self:service.name}
            - Authorization
            - Header
            - Restriction
      Rules:
        -
          Action:
            Type: ALLOW
          Priority: 1
          RuleId:
            Ref: CustomAuthorizationHeaderRule

config/resources/environment/other.yml:

Resources:
  # Require the custom authorisation header with the correct password in non-production environment
  CustomAuthorizationHeaderRestriction:
    Type: AWS::WAF::WebACL
    Properties:
      DefaultAction:
        Type: BLOCK
      Name:
        Fn::Join:
          - "_"
          - - ${self:custom.domain.domainname}
            - Authorization
            - Header
            - Restriction
      MetricName:
        Fn::Join:
          - ""
          - - ${self:custom.stage}
            - ${self:service.name}
            - Authorization
            - Header
            - Restriction
      Rules:
        -
          Action:
            Type: ALLOW
          Priority: 1
          RuleId:
            Ref: CustomAuthorizationHeaderRule

# Example Site