S3
Simple Storage Service (S3)
Give devs and IT team secure, durable, scalable object storage
Not for OS and database
Data spread across multiple devices and facilities so these are HA
Object based
Objects can be 0 Bytest to 5 TB
Larges size file that can transfer to S3 using PUT is 5 GB
Unlimited Storage
Files are stored in Buckets (similar to folder)
S3 is a universal namespace (means they are a global service)
When upload is successful you recieve a HTTP 200 status code
Data consitency
Read after Write consistency for PUTS of new Objects
Means that as soon as file is uploaded it is available to read
Eventual Consistency for overwrite PUTS and DELETES
Means can take some time for changes to propegate
S3 Under Hood
S3 is object based and objects conists of the following
Key (This is the name of an object)
Value (This is the data, sequence of bytes)
Version ID (Allows for versioning and rolling back)
Metadata (Data about the data)
Subresources - Bucket specfic configurations
Bucket Policies, access control lists
Cross Origin Resource Sharing (CORS)
Transfer Acceleration
Allows for increase transfer speed
The Basics
Built for 99.99% availability for S3 platform
Amazon Gurantees 99.9% availability
Amazon Gurantees 99.999999999% (the 11 9's) durability for S3 information
This means they expect you to lose 1 file ever 10,000 years
Tiered Storage Available
Lifecycle Management
Versioning
Encryption
Secure data access with Access Control lists
Storage Tiers/Classes
S3 (standard)
99.99% availability
11 9s durability
Redundancy exists across multiple devices in multiple facilities and is designed to sustain loss of 2 facilities concurrently
S3 IA (Infrequently Accessed)
Data that is access less frequently but requires rapid access when needed
Lower fee than S3 but chared with access data
S3 One Zone IA
Same as IA however data is stored in a single AZ
Still contains 11 9s durability
Only 99.5% availability
Cost 20% less than regular S3 IA
S3 Reduced Redundancy Storage
Designed with only 99.99% durability
99.99% Availability
Meant for data that is easily recreatable is lost
Not offered in some regions
AWS recommends NOT using this class
Recommended to use S3 standard
S3 standard price was reduce to be lower than this to help move people off
S3 Glacier
Designed for Data that is infrequently accessed
Very Cheap
Takes 3-5 hours to retrieve data
S3 Deep Archive Glacier
Glacier but more infrequent
12 hour retrieval time
S3 Intelligent Tiering
Designed for data with unknown or unpredictable access patterns
2 tiers
frequent
infrequent
Objects are automatically moved around between tiers depending on access patterns
11 9s durability
99.99% accessibility
Small Monthly fee for monitoring/automation
Charges
Storage per GB
Requests (GET, PUT, COPY, ect)
Storage Mangement Pricing
Inventory, Analytics, Object Tags
Data Management Pricing
Free transfer in
Charge for transfer in
Transfer Acceleration
Use of CloudFront to optimize speed
S3 ACLs & Policies
Access Control List ACL
Per file/object basis
By default only access to owner of file
Fine grain access control
Bucket Policies
Documents that govern what settings for a given resource
JSON objects
Can use policy generator so you dont have to write JSON
Will generate JSON code that you can copy and paste into policy editor
Sometimes the policy generator will need to have a wildcard operator added to the resource
S3 Encryption
Encryption
In Transit (Encrypt data being sent to and from bucket)
At Rest (Encrypt data that is in bucket)
Server Side Encryption
S3 Managed Keys - SSE-S3
Amazon does everything
Encrypts data
Encrypts Keys
AWS Key Management Service, Managed Keys, SSE-KMS
Get a seperate key called "Envelope Key"
This is key that encrypts the key
Get Audit trail to see when and who has used encryption key
Can use your own key
Server side Encryption with customer Provided Key
AWS manages encryption and decryption
User manages their own keys
Client Side Encryption
User encrypts files themselves before uploading files into S3
Enforcing Encryption
Every upload sends a PUT request
Request header contains information for S3 to understand what needs to be done
x-amz-server-side-encryption parameter included in request header if file needs encrypted at upload time
Two options available
x-amz-server-side-encryption: AES256 (SSE-S3 - S3 managed keys)
x-amz-server-side-encryption: ams:kms (SSE-KMS - KMS managed keys)
In order to enforce encryption all we need to do is make a bucket policy that denies any S3 PUT requests which do not include the x-amz-server-side-encryption
How to enable
Can be done via the the console (which is the easy way)
Can be done with an S3 Bucket Policy (Written in JSON)
Cross-Access-Resource-Sharing (CORS)
Allows access from one resource (bucket) to another
If you open to all you leave your resources exposed and vulnerable to attack
S3 Perfomance Optimization
(This has been deprecated advice since AWS increased S3 performance in 2018)
By default S3 is designed to handle very high request rates
If you are recieving over 100 PUT/LIST or 300 DELETE per second then you need to optimize
This was changed in 2018 to 3500 put requests per second and 5500 get requests
Mixed Request Type Workloads
A mix of GET, PUT, DELETE, LIST
Optimize by NOT using sequential keys as part of the file names
Random over sequential
This spreads the I/O on different partitions which spreads the I/O
Back to top