S3 (Simple Storage Service)

What is Object Storage (Object-based Storage)?

Object storage is a data storage architecture that manages data as objects, as opposed to other storage architectures:

  • file systems, which manage data as files within a file hierarchy, and
  • block storage, which manages data as blocks within sectors and tracks.

S3 provides you with unlimited storage. You don’t need to think about the underlying infrastructure. The S3 Console provides an interface for you to upload and access your data.

S3 Object

Objects contain your data. They are like files. An object may consist of:

  • Key: the name of the object
  • Value: the data itself, made up of a sequence of bytes
  • Version ID: when versioning is enabled, the version of the object
  • Metadata: additional information attached to the object

You can store objects from 0 bytes to 5 terabytes in size.

S3 Bucket

Buckets hold objects. Buckets can also have folders, which in turn hold objects.

S3 uses a universal namespace, so bucket names must be globally unique (think of it like having a domain name).

S3 - Storage Classes

Trade Retrieval Time, Accessibility and Durability for Cheaper Storage

  • Standard (default): fast, 99.99% availability, 11 9’s durability. Replicated across at least three AZs.
  • Intelligent Tiering: uses ML to analyze your object usage and determine the appropriate storage class. Data is moved to the most cost-effective access tier without any performance impact or added overhead.
  • Standard Infrequently Accessed (IA): still fast! Cheaper if you access files less than once a month. An additional retrieval fee is applied. 50% less than Standard (reduced availability).
  • One Zone IA: still fast! Objects exist in only one AZ. Availability is 99.5%, and it is 20% cheaper than Standard IA (reduced durability). Data could get destroyed if that AZ is lost. A retrieval fee is applied.
  • Glacier: for long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage.
  • Glacier Deep Archive: the lowest-cost storage class. Data retrieval time is 12 hours.

Security

By default, all new buckets are private when created.

Logging per request can be turned on for a bucket. Log files are generated and saved in a different bucket (even a bucket in a different AWS account, if desired).

Access control is configured using Bucket Policies and Access Control Lists (ACLs).
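To make the idea of a bucket policy more concrete, here is a minimal sketch that builds one as a Python dictionary and serializes it to JSON. The bucket name, account ID, and user name are hypothetical placeholders, and the policy shown (read-only access for a single IAM user) is just one possible example.

```python
import json


def read_only_bucket_policy(bucket: str, principal_arn: str) -> dict:
    """Build a bucket policy that lets one principal read objects in `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForOnePrincipal",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # Object-level actions apply to the objects, hence the trailing /*
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Hypothetical bucket and IAM user, for illustration only.
policy = read_only_bucket_policy(
    "my-example-bucket",
    "arn:aws:iam::123456789012:user/example-user",
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON is the kind of document you would attach to the bucket as its policy.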

Encryption

Encryption In Transit: traffic between your local host and S3 is encrypted via SSL/TLS.

Server-Side Encryption (SSE), Encryption At Rest: Amazon helps you encrypt the object data.

  • SSE-AES (S3-managed keys, also known as SSE-S3): Amazon manages all the keys. S3 handles the key and uses the AES-256 algorithm.
  • SSE-KMS: envelope encryption. AWS KMS and you manage the keys.
  • SSE-C: customer-provided key (you manage the keys).

Client-Side Encryption: you encrypt your own files before uploading them to S3.
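The three server-side options map to different request parameters when uploading. The sketch below, assuming the parameter names used by boto3's `put_object` call, builds the extra keyword arguments for each mode. The helper name `sse_put_args` and the mode labels are hypothetical and chosen to match the terms used above.

```python
from typing import Optional


def sse_put_args(
    mode: str,
    kms_key_id: Optional[str] = None,
    customer_key: Optional[bytes] = None,
) -> dict:
    """Return extra upload arguments for the chosen server-side encryption mode."""
    if mode == "SSE-AES":
        # S3 manages the key and encrypts with AES-256.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            # Without an explicit key ID, the AWS-managed KMS key is used.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        # The customer supplies (and must keep) a 256-bit key.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte customer key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


print(sse_put_args("SSE-AES"))
```

With a boto3 client `s3`, these arguments would be passed along as `s3.put_object(Bucket=..., Key=..., Body=..., **sse_put_args("SSE-AES"))`.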

Data Consistency

New Objects (PUTs): Read-After-Write Consistency

When you upload a new object, you are able to read it immediately after writing.

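Overwrites (PUTs) or Deletes (DELETEs): Eventual Consistency

When you overwrite or delete an object, it takes time for S3 to replicate the change to all AZs.

If you were to read immediately, S3 might return an old copy. You generally need to wait a few seconds before reading.

Note that this describes S3’s original consistency model: since December 2020, S3 provides strong read-after-write consistency for overwrites and deletes as well.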

Cross Region Replication (CRR)

When enabled, any object that is uploaded will be automatically replicated to another region (or regions). This provides higher durability and potential disaster recovery for objects.

You must have versioning turned on for both the source and destination buckets. You can have CRR replicate to another AWS account.
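As an illustration, a replication setup can be described as a configuration document. The sketch below builds one in roughly the shape accepted by boto3's `put_bucket_replication` call. The role ARN, bucket name, account ID, and rule ID are hypothetical placeholders, and the exact set of fields you need may differ.

```python
from typing import Optional


def replication_configuration(
    role_arn: str,
    destination_bucket: str,
    destination_account: Optional[str] = None,
) -> dict:
    """Build a one-rule replication configuration for the whole bucket."""
    destination = {"Bucket": f"arn:aws:s3:::{destination_bucket}"}
    if destination_account is not None:
        # Cross-account replication: name the owning account and hand
        # ownership of the replicas to the destination bucket owner.
        destination["Account"] = destination_account
        destination["AccessControlTranslation"] = {"Owner": "Destination"}
    return {
        "Role": role_arn,  # IAM role S3 assumes to copy the objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": destination,
            }
        ],
    }


config = replication_configuration(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "my-replica-bucket",
    destination_account="210987654321",
)
print(config["Rules"][0]["Destination"])
```

Both buckets still need versioning enabled before a configuration like this can be applied.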

Versioning

  • Store all versions of an object in S3
  • Once enabled it cannot be disabled, only suspended on the bucket
  • Fully integrates with S3 Lifecycle rules
  • MFA Delete feature provides extra protection against deletion of your data
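The "enabled or suspended, never disabled" rule shows up directly in the configuration values. Here is a small sketch, assuming the `VersioningConfiguration` shape used by boto3's `put_bucket_versioning`. The helper name is hypothetical.

```python
def versioning_configuration(status: str) -> dict:
    """Build a versioning configuration; only two states can be requested."""
    allowed = {"Enabled", "Suspended"}
    if status not in allowed:
        # There is no "Disabled" value: once on, versioning can only be suspended.
        raise ValueError(f"status must be one of {sorted(allowed)}, got {status!r}")
    return {"Status": status}


print(versioning_configuration("Enabled"))
print(versioning_configuration("Suspended"))
```

With a boto3 client `s3`, the result would be passed as `s3.put_bucket_versioning(Bucket=..., VersioningConfiguration=versioning_configuration("Enabled"))`.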

Lifecycle Management

Automate the process of moving objects to different storage classes or deleting objects altogether.

Can be used together with versioning. Can be applied to both current and previous versions.
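A lifecycle rule can be sketched as a configuration document like the one below, in roughly the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The `logs/` prefix, the rule ID, and the day counts are assumptions picked for illustration.

```python
def lifecycle_configuration(prefix: str) -> dict:
    """Build one rule that tiers objects down over time, then deletes them."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Current versions: move to cheaper classes as they age.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                # Current versions: delete after a year.
                "Expiration": {"Days": 365},
                # Previous versions (requires versioning): delete 30 days
                # after they stop being the current version.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    }


rule = lifecycle_configuration("logs/")["Rules"][0]
for transition in rule["Transitions"]:
    print(transition["Days"], transition["StorageClass"])
```

The rule covers both current versions (`Transitions`, `Expiration`) and previous versions (`NoncurrentVersionExpiration`).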

Transfer Acceleration

Fast and secure transfer of files over long distances between your end users and an S3 bucket.

Utilizes CloudFront’s distributed Edge Locations.

Instead of uploading to your bucket, users use a distinct URL for an Edge Location.

As data arrives at the Edge Location, it is automatically routed to S3 over a specially optimized network path (Amazon’s backbone network).
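The distinct URL follows a fixed pattern based on the bucket name. A minimal sketch of building it is shown below. The helper name and the example bucket and key are hypothetical, and the request still has to be authorized like any other S3 request.

```python
from urllib.parse import quote


def accelerate_url(bucket: str, key: str) -> str:
    """Return the Transfer Acceleration endpoint URL for an object."""
    if "." in bucket:
        # Bucket names containing periods cannot be used with acceleration.
        raise ValueError("bucket name must not contain periods")
    # quote() percent-encodes the key but leaves "/" separators intact.
    return f"https://{bucket}.s3-accelerate.amazonaws.com/{quote(key)}"


url = accelerate_url("my-example-bucket", "videos/clip 1.mp4")
print(url)
```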

Presigned URL

Generate a URL which provides you temporary access to an object to either upload or download object data. Presigned URLs are commonly used to provide access to private objects. You can use the AWS CLI or an AWS SDK to generate presigned URLs.

aws s3 presign s3://mybucket/myobject --expires-in 300

Example: you have a web application which needs to allow users to download files from a password-protected part of your web app. Your web app generates a presigned URL which expires after 5 seconds. The user downloads the file.
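On the SDK side, the web app in this scenario might do something like the following sketch. It assumes a boto3 S3 client is passed in and uses its `generate_presigned_url` method. The helper name `presign_download` and the bucket and key in the comment are hypothetical.

```python
def presign_download(s3_client, bucket: str, key: str, expires_in: int = 300) -> str:
    """Return a temporary download URL for a private object.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        ClientMethod="get_object",  # use "put_object" for an upload URL
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


# Typical use inside the web app (requires boto3 and AWS credentials):
#   import boto3
#   url = presign_download(boto3.client("s3"), "mybucket", "myobject", expires_in=5)
```

Taking the client as a parameter keeps the helper easy to test with a stand-in object. The URL is signed locally from your credentials, so generating it does not require the object to be fetched first.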

MFA Delete

MFA Delete ensures users cannot delete objects from a bucket unless they provide their MFA code.

Only the root user