Skip to content

S3

  • Simple Storage Service (S3)
  • Give devs and IT team secure, durable, scalable object storage
    • Not for OS and database
    • Data spread across multiple devices and facilities so these are HA
  • Object based
    • Objects can be 0 Bytest to 5 TB
  • Larges size file that can transfer to S3 using PUT is 5 GB
  • Unlimited Storage
  • Files are stored in Buckets (similar to folder)
  • S3 is a universal namespace (means they are a global service)
  • When upload is successful you recieve a HTTP 200 status code
  • Data consitency
  • Read after Write consistency for PUTS of new Objects
    • Means that as soon as file is uploaded it is available to read
  • Eventual Consistency for overwrite PUTS and DELETES
    • Means can take some time for changes to propegate
  • S3 Under Hood
  • S3 is object based and objects conists of the following
    • Key (This is the name of an object)
    • Value (This is the data, sequence of bytes)
    • Version ID (Allows for versioning and rolling back)
    • Metadata (Data about the data)
    • Subresources - Bucket specfic configurations
    • Bucket Policies, access control lists
    • Cross Origin Resource Sharing (CORS)
    • Transfer Acceleration
    • Allows for increase transfer speed

The Basics

  • Built for 99.99% availability for S3 platform
  • Amazon Gurantees 99.9% availability
  • Amazon Gurantees 99.999999999% (the 11 9's) durability for S3 information
  • This means they expect you to lose 1 file ever 10,000 years
  • Tiered Storage Available
  • Lifecycle Management
  • Versioning
  • Encryption
  • Secure data access with Access Control lists

Storage Tiers/Classes

S3 (standard)

  • 99.99% availability
  • 11 9s durability
  • Redundancy exists across multiple devices in multiple facilities and is designed to sustain loss of 2 facilities concurrently

S3 IA (Infrequently Accessed)

  • Data that is access less frequently but requires rapid access when needed
  • Lower fee than S3 but chared with access data

S3 One Zone IA

  • Same as IA however data is stored in a single AZ
  • Still contains 11 9s durability
  • Only 99.5% availability
  • Cost 20% less than regular S3 IA

S3 Reduced Redundancy Storage

  • Designed with only 99.99% durability
  • 99.99% Availability
  • Meant for data that is easily recreatable is lost
  • Not offered in some regions
  • AWS recommends NOT using this class
  • Recommended to use S3 standard
  • S3 standard price was reduce to be lower than this to help move people off

S3 Glacier

  • Designed for Data that is infrequently accessed
  • Very Cheap
  • Takes 3-5 hours to retrieve data

S3 Deep Archive Glacier

  • Glacier but more infrequent
  • 12 hour retrieval time

S3 Intelligent Tiering

  • Designed for data with unknown or unpredictable access patterns
  • 2 tiers
  • frequent
  • infrequent
  • Objects are automatically moved around between tiers depending on access patterns
  • 11 9s durability
  • 99.99% accessibility
  • Small Monthly fee for monitoring/automation

Charges

  • Storage per GB
  • Requests (GET, PUT, COPY, ect)
  • Storage Mangement Pricing
  • Inventory, Analytics, Object Tags
  • Data Management Pricing
  • Free transfer in
  • Charge for transfer in
  • Transfer Acceleration
  • Use of CloudFront to optimize speed

S3 ACLs & Policies

Access Control List ACL

  • Per file/object basis
  • By default only access to owner of file
  • Fine grain access control

Bucket Policies

  • Documents that govern what settings for a given resource
  • JSON objects
  • Can use policy generator so you dont have to write JSON
  • Will generate JSON code that you can copy and paste into policy editor
  • Sometimes the policy generator will need to have a wildcard operator added to the resource

S3 Encryption

  • Encryption
  • In Transit (Encrypt data being sent to and from bucket)
    • Done with SSL/TLS
  • At Rest (Encrypt data that is in bucket)
    • Server Side Encryption
    • S3 Managed Keys - SSE-S3
      • Amazon does everything
      • Encrypts data
      • Encrypts Keys
    • AWS Key Management Service, Managed Keys, SSE-KMS
      • Get a seperate key called "Envelope Key"
      • This is key that encrypts the key
      • Get Audit trail to see when and who has used encryption key
      • Can use your own key
    • Server side Encryption with customer Provided Key
      • AWS manages encryption and decryption
      • User manages their own keys
    • Client Side Encryption
    • User encrypts files themselves before uploading files into S3
  • Enforcing Encryption
  • Every upload sends a PUT request
  • Request header contains information for S3 to understand what needs to be done
  • x-amz-server-side-encryption parameter included in request header if file needs encrypted at upload time
    • Two options available
    • x-amz-server-side-encryption: AES256 (SSE-S3 - S3 managed keys)
    • x-amz-server-side-encryption: ams:kms (SSE-KMS - KMS managed keys)
  • In order to enforce encryption all we need to do is make a bucket policy that denies any S3 PUT requests which do not include the x-amz-server-side-encryption
  • How to enable
    • Can be done via the the console (which is the easy way)
    • Can be done with an S3 Bucket Policy (Written in JSON)

Cross-Access-Resource-Sharing (CORS)

  • Allows access from one resource (bucket) to another
  • If you open to all you leave your resources exposed and vulnerable to attack

S3 Perfomance Optimization

  • (This has been deprecated advice since AWS increased S3 performance in 2018)
  • By default S3 is designed to handle very high request rates
  • If you are recieving over 100 PUT/LIST or 300 DELETE per second then you need to optimize
  • This was changed in 2018 to 3500 put requests per second and 5500 get requests
  • Mixed Request Type Workloads
  • A mix of GET, PUT, DELETE, LIST
  • Optimize by NOT using sequential keys as part of the file names
  • Random over sequential
  • This spreads the I/O on different partitions which spreads the I/O
Back to top