S3 (Simple Storage Service)
What is Object Storage (Object-based Storage)?
Object storage is a data storage architecture that manages data as objects, as opposed to other storage architectures:
- file systems, which manage data as files in a file hierarchy, and
- block storage, which manages data as blocks within sectors and tracks.
S3 provides you with unlimited storage. You don't need to think about the underlying infrastructure. The S3 Console provides an interface for you to upload and access your data.
S3 Object
Objects contain your data. They are like files. An object may consist of:
- Key: the name of the object
- Value: the data itself, made up of a sequence of bytes
- Version ID: when versioning is enabled, the version of the object
- Metadata: additional information attached to the object
You can store objects from 0 bytes to 5 terabytes in size.
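The parts of an object listed above can be pictured as a small record. The following is a minimal sketch, not an SDK type: the class name `S3ObjectSketch` and the constant `MAX_OBJECT_SIZE` are hypothetical, and the 5 TB limit is interpreted here as 5 TiB for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: the "5 Terabytes" limit from the notes, taken here as 5 TiB.
MAX_OBJECT_SIZE = 5 * 1024**4


@dataclass
class S3ObjectSketch:
    """Toy model of the parts of an S3 object (hypothetical, not an SDK class)."""

    key: str                          # the name of the object
    value: bytes = b""                # the data itself, a sequence of bytes
    version_id: Optional[str] = None  # only present when versioning is enabled
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Objects may be empty (0 bytes) but must not exceed the size limit.
        if len(self.value) > MAX_OBJECT_SIZE:
            raise ValueError("object exceeds the 5 TB size limit")


example_object = S3ObjectSketch(
    key="reports/2020/q1.csv",
    value=b"a,b\n1,2\n",
    metadata={"content-type": "text/csv"},
)
```

Note that the key here contains slashes: a "folder" in a bucket can be thought of as a shared key prefix such as `reports/2020/`.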
S3 Bucket
Buckets hold objects. Buckets can also have folders, which in turn hold objects.
S3 uses a universal namespace, so bucket names must be globally unique (much like a domain name).
S3 - Storage Classes
Trade Retrieval Time, Accessibility and Durability for Cheaper Storage
| Standard (default) | Fast. 99.99% availability and 11 nines (99.999999999%) of durability. Replicated across at least three AZs. |
| Intelligent Tiering | Uses ML to analyze your object usage and determine the appropriate storage class. Data is moved to the most cost-effective access tier, without any performance impact or added overhead. |
| Standard Infrequently Accessed (IA) | Still fast! Cheaper if you access files less than once a month. An additional retrieval fee is applied. About 50% cheaper than Standard, with reduced availability. |
| One Zone IA | Still fast! Objects exist in only one AZ. Availability is 99.5%. About 20% cheaper than Standard IA, with reduced durability: data could be destroyed if the AZ is lost. A retrieval fee is applied. |
| Glacier | For long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage. |
| Glacier Deep Archive | The lowest-cost storage class. Data retrieval time is up to 12 hours. |
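To make the trade-offs in the table concrete, here is a rough sketch that maps the class names above to the identifiers the S3 API uses for them and picks a class from a few inputs. The function `suggest_storage_class`, its parameters, and its thresholds are illustrative assumptions, not AWS guidance.

```python
# Names from the table mapped to the StorageClass identifiers used by the S3 API.
STORAGE_CLASS_API_NAMES = {
    "Standard": "STANDARD",
    "Intelligent Tiering": "INTELLIGENT_TIERING",
    "Standard Infrequently Accessed (IA)": "STANDARD_IA",
    "One Zone IA": "ONEZONE_IA",
    "Glacier": "GLACIER",
    "Glacier Deep Archive": "DEEP_ARCHIVE",
}


def suggest_storage_class(
    accesses_per_month: float,
    tolerate_single_az: bool = False,
    acceptable_retrieval_hours: float = 0.0,
) -> str:
    """Toy heuristic following the table; all thresholds are assumptions."""
    if acceptable_retrieval_hours >= 12:
        return "DEEP_ARCHIVE"   # lowest cost, slowest retrieval
    if acceptable_retrieval_hours >= 1:
        return "GLACIER"        # cold storage, minutes to hours
    if accesses_per_month < 1:
        # Infrequent access: accept one AZ only if losing the data is tolerable.
        return "ONEZONE_IA" if tolerate_single_az else "STANDARD_IA"
    return "STANDARD"
```

Intelligent Tiering is left out of the heuristic on purpose: it is the option for when the access pattern is not known in advance.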
Security
All new buckets are private by default.
Per-request logging can be turned on for a bucket. Log files are generated and saved in a different bucket (even a bucket in a different AWS account, if desired).
Access control is configured using Bucket Policies and Access Control Lists (ACLs).
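A bucket policy is a JSON document attached to the bucket. As a minimal sketch, the helper below builds a policy that lets one other AWS account read objects; the function name `read_only_policy`, the bucket name, and the account ID `111122223333` are placeholders chosen for illustration.

```python
import json


def read_only_policy(bucket: str, account_id: str) -> dict:
    """Build a bucket policy granting s3:GetObject to one AWS account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:GetObject",
                # Object-level actions apply to the objects, hence the /* suffix.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


policy_json = json.dumps(read_only_policy("mybucket", "111122223333"), indent=2)
```

The resulting JSON string is what you would paste into the bucket's policy editor or pass to the CLI or SDK.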
Encryption
Encryption in Transit: traffic between your local host and S3 is encrypted via SSL/TLS.
Server-Side Encryption (SSE), Encryption at Rest: Amazon helps you encrypt the object data.
- SSE-AES (also called SSE-S3): S3-managed keys. Amazon manages all the keys; S3 handles the key and uses the AES-256 algorithm
- SSE-KMS: envelope encryption; AWS KMS and you manage the keys
- SSE-C: customer-provided key (you manage the keys)
Client-Side Encryption: you encrypt your own files before uploading them to S3.
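On the wire, the server-side encryption mode is selected with request headers on the upload. The sketch below builds those headers for each mode; the helper name `sse_headers` and its parameters are hypothetical, and in practice the CLI or an SDK sets these headers for you.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(
    mode: str,
    customer_key: Optional[bytes] = None,
    kms_key_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return the request headers that select a server-side encryption mode."""
    if mode == "SSE-AES":  # S3-managed keys, also called SSE-S3
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id is not None:  # otherwise the default KMS key is used
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) customer key")
        # The key and its MD5 digest are both sent base64-encoded.
        key_md5 = hashlib.md5(customer_key).digest()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(key_md5).decode("ascii"),
        }
    raise ValueError(f"unknown mode: {mode}")
```

With SSE-C the same key headers must be supplied again on every read, since S3 does not store the customer key.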
Data Consistency
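New Objects (PUTS): Read-After-Write Consistency
When you upload a new object, you are able to read it immediately after writing.
Overwrite (PUTS) or Delete Objects (DELETES): Eventual Consistency
When you overwrite or delete an object, it takes time for S3 to replicate the change across AZs.
If you were to read immediately, S3 may return an old copy. You generally need to wait a few seconds before reading.
This was S3's original consistency model. Since December 2020, S3 provides strong read-after-write consistency for all PUTS and DELETES, so a read issued after a successful overwrite or delete returns the latest state.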
Cross Region Replication (CRR)
When enabled, any object that is uploaded will be automatically replicated to one or more other regions. This provides higher durability and potential disaster recovery for objects.
You must have versioning turned on for both the source and destination buckets. You can have CRR replicate to a bucket in another AWS account.
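The versioning requirement can be expressed as a precondition on the replication setup. The sketch below builds a replication configuration in the shape the S3 API expects, as far as I understand it, and refuses to do so unless both buckets have versioning enabled. The function name, rule ID, and ARNs are illustrative placeholders.

```python
def replication_configuration(
    role_arn: str,
    destination_bucket: str,
    source_versioning: str,
    destination_versioning: str,
) -> dict:
    """Build a one-rule replication configuration (sketch, not SDK code)."""
    # CRR requires versioning on both sides, as stated in the notes.
    if source_versioning != "Enabled" or destination_versioning != "Enabled":
        raise ValueError(
            "versioning must be enabled on both source and destination buckets"
        )
    return {
        "Role": role_arn,  # IAM role S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter: apply to all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


crr_config = replication_configuration(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "my-destination-bucket",
    source_versioning="Enabled",
    destination_versioning="Enabled",
)
```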
Versioning
- Store all versions of an object in S3
- Once enabled it cannot be disabled, only suspended on the bucket
- Fully integrates with S3 Lifecycle rules
- MFA Delete feature provides extra protection against deletion of your data
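The behaviour of a versioned bucket can be illustrated with a small in-memory model. This is a toy sketch, not how S3 is implemented: the class `VersionedBucketSketch` and its sequential version IDs are made up for illustration. It shows that an overwrite keeps the older version and that a plain delete only adds a delete marker.

```python
from typing import Dict, List, Optional, Tuple


class VersionedBucketSketch:
    """In-memory toy model of a bucket with versioning enabled."""

    def __init__(self) -> None:
        # key -> list of (version_id, value); a value of None is a delete marker
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._counter = 0

    def _next_id(self) -> str:
        self._counter += 1
        return f"v{self._counter}"  # real version IDs are opaque strings

    def put(self, key: str, value: bytes) -> str:
        """Store a new version; earlier versions are kept."""
        version_id = self._next_id()
        self._versions.setdefault(key, []).append((version_id, value))
        return version_id

    def delete(self, key: str) -> str:
        """A delete without a version ID adds a delete marker on top."""
        version_id = self._next_id()
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> bytes:
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins; a delete marker makes the key look absent.
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, value in versions:
            if vid == version_id and value is not None:
                return value
        raise KeyError((key, version_id))
```

For example, after `put("a", b"1")`, `put("a", b"2")`, and `delete("a")`, a plain `get("a")` fails, but both earlier versions can still be fetched by their version IDs.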
Lifecycle Management
Automate the process of moving objects to different storage classes or deleting objects altogether.
Can be used together with versioning. Can be applied to both current and previous versions.
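A lifecycle rule is essentially a list of age thresholds. The sketch below writes one rule in the shape used by the S3 lifecycle configuration API, as far as I understand it, and adds a hypothetical helper, `class_at_age`, that evaluates which storage class a current version would be in at a given age. The prefix, rule ID, and day counts are example values.

```python
from typing import Optional

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            # Current versions: move to cheaper classes, then delete.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
            # Previous (noncurrent) versions get their own schedule.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
            ],
            "NoncurrentVersionExpiration": {"NoncurrentDays": 180},
        }
    ]
}


def class_at_age(rule: dict, age_days: int, initial: str = "STANDARD") -> Optional[str]:
    """Storage class of a current version at a given age, or None if expired."""
    expiration_days = rule.get("Expiration", {}).get("Days")
    if expiration_days is not None and age_days >= expiration_days:
        return None
    current = initial
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current
```

The helper only simulates the rule locally; S3 itself applies lifecycle rules asynchronously, so real transitions may lag the configured day slightly.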
Transfer Acceleration
Fast and secure transfer of files over long distances between your end users and an S3 bucket.
Utilizes CloudFront's distributed Edge Locations.
Instead of uploading directly to your bucket, users use a distinct URL for an Edge Location.
As data arrives at the Edge Location, it is automatically routed to S3 over a specially optimized network path (Amazon's backbone network).
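The "distinct URL" is the bucket's accelerate endpoint. A minimal sketch of how that hostname is formed, assuming acceleration has already been enabled on the bucket; the helper name `accelerate_endpoint` is hypothetical.

```python
def accelerate_endpoint(bucket: str, dualstack: bool = False) -> str:
    """Return the Transfer Acceleration endpoint URL for a bucket."""
    # Acceleration requires a DNS-compliant bucket name without periods.
    if "." in bucket:
        raise ValueError("bucket names with periods cannot use Transfer Acceleration")
    suffix = "s3-accelerate.dualstack" if dualstack else "s3-accelerate"
    return f"https://{bucket}.{suffix}.amazonaws.com"


upload_url_base = accelerate_endpoint("mybucket")
```

Clients then send their uploads to this host instead of the regular regional endpoint.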
Presigned URL
Generate a URL that provides temporary access to an object, to either upload or download object data. Presigned URLs are commonly used to provide access to private objects. You can use the AWS CLI or an AWS SDK to generate presigned URLs. For example, with the AWS CLI (--expires-in is given in seconds):
aws s3 presign s3://mybucket/myobject --expires-in 300
Example use case: you have a web application that needs to let users download files from a password-protected part of the app. The web app generates a presigned URL that expires after 5 seconds, and the user downloads the file through it.
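To show what a presigned URL actually contains, here is a sketch of Signature Version 4 query-string signing for a GET request using only the standard library. It is meant for understanding, not production: a real application should use the CLI or an SDK. The function name `presign_get` is hypothetical, the credentials are the placeholder example keys from AWS documentation, and temporary session tokens are not handled.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get(bucket, key, access_key, secret_key,
                region="us-east-1", expires_in=300, now=None):
    """Build a presigned GET URL (SigV4 query-string signing, sketch only)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    host = (f"{bucket}.s3.amazonaws.com" if region == "us-east-1"
            else f"{bucket}.s3.{region}.amazonaws.com")
    scope = f"{datestamp}/{region}/s3/aws4_request"

    # Query parameters that are part of the signature, sorted by name.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires_in),  # lifetime in seconds
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in sorted(params.items())
    )
    canonical_uri = "/" + quote(key, safe="/")
    canonical_request = "\n".join([
        "GET", canonical_uri, canonical_query,
        f"host:{host}\n",   # canonical headers, each terminated by a newline
        "host",             # signed headers
        "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key from the secret key, date, region, and service.
    signing_key = ("AWS4" + secret_key).encode("utf-8")
    for part in (datestamp, region, "s3", "aws4_request"):
        signing_key = _hmac_sha256(signing_key, part)
    signature = hmac.new(signing_key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"


example_url = presign_get(
    "mybucket", "myobject",
    access_key="AKIAIOSFODNN7EXAMPLE",
    secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    expires_in=300,
)
```

The key point is that the expiry is part of the signed query string, so a user cannot extend a link's lifetime by editing `X-Amz-Expires`: the signature would no longer match.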
MFA Delete
MFA Delete ensures users cannot delete objects from a bucket unless they provide their MFA code.
Only the root user