S3 (Simple Storage Service) is one of the core services that AWS (Amazon Web Services) provides, and probably one of the most famous. It is a cloud storage service that stores objects (data) with descriptive metadata inside globally uniquely named buckets, which are similar to folders on your computer.
Buckets are similar to the folders on your computer that contain your data and files. Using buckets, you can store a virtually unlimited amount of data in a safe place, and the data can be replicated across multiple AWS Availability Zones.
Objects represent your data, similar to what you store inside folders on your computer: photos, videos, backups of your computer, and so on.
Each object stored inside a bucket is made up of:
S3 Standard is the best-known S3 storage class, but what about the others?
First of all, all newly created buckets are PRIVATE by default, which means no one can access any file in the bucket without being explicitly allowed.
S3 buckets can be configured to create access logs, and the logs can be stored in a completely different bucket or even in another AWS account.
There are multiple methods to encrypt S3 data:
Encryption in transit is achieved with SSL/TLS.
Encryption at rest (server side) can be achieved by multiple methods.
Client-side encryption is achieved by encrypting the object yourself and then uploading it to S3.
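As a rough illustration of server-side encryption, the sketch below builds the keyword arguments for an upload request that asks S3 to encrypt the object at rest. It assumes you upload with boto3's `put_object`, whose `ServerSideEncryption` and `SSEKMSKeyId` parameters select S3-managed keys (`AES256`) or KMS-managed keys (`aws:kms`). The helper name `build_put_object_args`, the bucket name, and the key alias are hypothetical.

```python
from typing import Optional


def build_put_object_args(bucket: str, key: str, body: bytes,
                          sse: str = "AES256",
                          kms_key_id: Optional[str] = None) -> dict:
    """Build keyword arguments for an upload with server-side encryption.

    sse is "AES256" (S3-managed keys) or "aws:kms" (KMS-managed keys).
    """
    if sse not in ("AES256", "aws:kms"):
        raise ValueError(f"unsupported server-side encryption: {sse}")
    args = {"Bucket": bucket, "Key": key, "Body": body,
            "ServerSideEncryption": sse}
    if kms_key_id is not None:
        if sse != "aws:kms":
            raise ValueError("a KMS key id only makes sense with sse='aws:kms'")
        args["SSEKMSKeyId"] = kms_key_id
    return args


# Hypothetical bucket, object key, and KMS alias, for illustration only.
args = build_put_object_args("example-bucket", "notes.txt", b"hello",
                             sse="aws:kms", kms_key_id="alias/example-key")
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("s3").put_object(**args)`.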
By default, S3 does not enable versioning for a bucket; it has to be enabled by the user.
Enabling versioning stores every version of an object, including all writes, even if you delete the object. Once enabled, versioning cannot be disabled, only suspended. Suspending versioning stops creating new versions for subsequent writes but keeps all existing versions. Each version has its own public/private access settings. When you delete an object, all of its versions remain, but the object gets a delete marker and is hidden from view. To restore the object, delete the delete marker.
S3 provides lifecycle rules for buckets, which let you automate moving objects between storage classes (and can be used in conjunction with versioning).
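The delete-marker behaviour is easier to see in a toy model. The sketch below is a simplified in-memory simulation of a versioning-enabled bucket, not the S3 API: the class `VersionedBucket`, its method names, and the version ids are all made up for illustration.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data)
        self._ids = itertools.count(1)

    def _append(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def put(self, key, data):
        # Every write creates a new version; older versions are kept.
        return self._append(key, data)

    def delete(self, key):
        # A plain delete only adds a delete marker on top of the versions.
        return self._append(key, _DELETE_MARKER)

    def delete_version(self, key, version_id):
        # Permanently remove one specific version or delete marker.
        self._versions[key] = [v for v in self._versions[key]
                               if v[0] != version_id]

    def list_versions(self, key):
        return [vid for vid, _ in self._versions.get(key, [])]

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is _DELETE_MARKER:
            raise KeyError(key)        # hidden behind a delete marker
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("report.txt", b"draft")
bucket.put("report.txt", b"final")
marker = bucket.delete("report.txt")          # object is now hidden
bucket.delete_version("report.txt", marker)   # removing the marker restores it
restored = bucket.get("report.txt")           # latest real version again
```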
Lifecycle examples:
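One possible rule, sketched as the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. The rule id, the prefix, the day counts, and the choice of the `STANDARD_IA` and `GLACIER` storage classes are assumptions picked for illustration, not recommendations.

```python
def build_lifecycle_configuration(prefix: str) -> dict:
    """Return a lifecycle configuration with one illustrative rule."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",        # hypothetical rule name
                "Filter": {"Prefix": prefix},       # only objects under prefix
                "Status": "Enabled",
                # Move current versions to cheaper classes as they age.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                # Expire current versions after a year.
                "Expiration": {"Days": 365},
                # With versioning on, also archive older (noncurrent) versions.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_lifecycle_configuration("logs/")
```

With boto3 and credentials available, this could be applied with `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="example-bucket", LifecycleConfiguration=config)`, where the bucket name is again a placeholder.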
S3 Object Lock allows you to store objects using a WORM (write once, read many) model. It helps prevent objects from being deleted or modified for a fixed amount of time or indefinitely, and it can be used to meet regulatory requirements or to add an extra layer of protection against changes and deletions. Object Lock has multiple modes.
In compliance mode, a protected object version cannot be overwritten or deleted by any user, including the root user of the account.
Its retention mode can't be changed, and its retention period can't be shortened.
Compliance mode ensures that an object version can't be overwritten or deleted for the duration of the retention period.
An Object Lock retention period protects an object version for a fixed amount of time. When you place a retention period on an object version, S3 stores a timestamp in the object version's metadata to indicate when the retention period expires. After the retention period expires, the object version can be overwritten or deleted, unless you have also placed a legal hold on the object version.
A legal hold prevents an object version from being overwritten or deleted. A legal hold has no retention period; it remains in effect until it is removed by any user who has the s3:PutObjectLegalHold permission.
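How the retention timestamp and the legal hold combine can be sketched as a small check. This is a simplified model of the rule described above, not S3's actual implementation; it ignores the differences between retention modes, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def can_delete_version(retain_until: Optional[datetime],
                       legal_hold: bool,
                       now: Optional[datetime] = None) -> bool:
    """Return True if a locked object version may be deleted or overwritten."""
    now = now or datetime.now(timezone.utc)
    if legal_hold:
        # A legal hold has no expiry; it blocks deletion until removed.
        return False
    if retain_until is not None and now < retain_until:
        # The retention period has not expired yet.
        return False
    return True


expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
before = datetime(2029, 6, 1, tzinfo=timezone.utc)
after = datetime(2030, 6, 1, tzinfo=timezone.utc)

can_delete_version(expiry, legal_hold=False, now=before)  # False: still retained
can_delete_version(expiry, legal_hold=False, now=after)   # True: period expired
can_delete_version(expiry, legal_hold=True, now=after)    # False: legal hold
```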
AWS S3 has extremely low latency: you get the first byte out of S3 within 100-200 milliseconds. S3 can handle 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. The prefix is the part of the path between the bucket name and the object name. For example, in mybucket/folder1/sub1/photo.jpg the prefix is folder1/sub1.
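Because the limits apply per prefix, the theoretical ceiling grows with the number of distinct prefixes in use. The following sketch extracts the prefix from an object key and multiplies out the GET limit quoted above. The helper names and sample keys are made up, and real throughput depends on more than this simple multiplication.

```python
PUT_RPS_PER_PREFIX = 3500   # PUT/COPY/POST/DELETE requests per second
GET_RPS_PER_PREFIX = 5500   # GET/HEAD requests per second


def key_prefix(key: str) -> str:
    """Return the part of an object key before the object name."""
    return key.rpartition("/")[0]


def max_get_rate(keys) -> int:
    """Upper bound on GET/HEAD requests per second across the keys' prefixes."""
    prefixes = {key_prefix(k) for k in keys}
    return len(prefixes) * GET_RPS_PER_PREFIX


keys = [
    "folder1/sub1/photo.jpg",
    "folder1/sub1/video.mp4",
    "folder2/sub1/photo.jpg",
]
prefix = key_prefix(keys[0])    # "folder1/sub1"
ceiling = max_get_rate(keys)    # two distinct prefixes -> 2 * 5500
```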
** Tips to improve S3 performance **
Spread reads across prefixes → the more prefixes you have, the better performance you get
Uploading
Downloads
Keep the SSE-KMS (server-side encryption with AWS Key Management Service) limits in mind when you encrypt objects: when you upload a file, GenerateDataKey is called in the KMS API, and when you download a file, Decrypt is called in the KMS API.
Uploads and downloads count towards the KMS quota. The quota is region-specific: it is either 5,500, 10,000, or 30,000 requests per second. Currently you cannot request a quota increase for KMS.
S3 Transfer Acceleration uses the CloudFront edge network to accelerate your uploads to S3. Instead of uploading directly to the S3 bucket, you use a distinct URL to upload to an edge location, which then transfers the file to the S3 bucket. The distinct URL looks like xxxxxx.s3-accelerate.amazonaws.com
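As a small sketch of that URL pattern, the helper below derives the accelerate endpoint from a bucket name. The function name and bucket name are hypothetical. The check for dots reflects the requirement that bucket names used with Transfer Acceleration must not contain periods.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration endpoint URL for a bucket."""
    if "." in bucket:
        raise ValueError(
            "Transfer Acceleration does not support bucket names with dots")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


url = accelerate_endpoint("example-bucket")
# url == "https://example-bucket.s3-accelerate.amazonaws.com"
```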
S3 Select enables applications to retrieve only a subset of data from an object by using simple SQL expressions. It lets you fetch only the data the application needs, which in many cases can improve performance by up to 400% and reduce cost by up to 80%. On the other hand, Glacier Select is used by highly regulated companies (financial services, healthcare, and others) that write data directly to Amazon Glacier to satisfy compliance needs like SEC Rule 17a-4 or HIPAA. Glacier Select allows you to run SQL queries against Glacier directly.
There are multiple options for moving your data to S3 other than basic uploading.
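To make the "simple SQL expressions" concrete, here is a sketch of the request parameters for boto3's `select_object_content` call against a CSV object with a header row. The bucket, the object key, and the column names `region` and `total` are invented for the example, and the helper `build_select_request` is hypothetical.

```python
def build_select_request(bucket: str, key: str, expression: str) -> dict:
    """Build keyword arguments for an S3 Select query over a CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as column names usable in the query.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        # Return matching rows as CSV as well.
        "OutputSerialization": {"CSV": {}},
    }


# Only rows for one region, and only two columns, leave S3.
request = build_select_request(
    "example-bucket",
    "sales/2023.csv",
    "SELECT s.region, s.total FROM S3Object s WHERE s.region = 'EU'",
)
```

With boto3 and credentials available, the parameters could be passed as `boto3.client("s3").select_object_content(**request)`; the response is an event stream whose records contain the filtered rows.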
DataSync
agent is deployed on a server and connected to your NAS or file system to copy data to AWS and write data from AWS. It automatically encrypts data and accelerates transfer over the WAN, and it also performs automatic data integrity checks in transit and at rest.
Snowball
is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS. Snowball comes in 50 TB or 80 TB sizes. It uses multiple layers of security to protect your data, including tamper-resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform Module (TPM) designed to ensure both security and full chain of custody of your data. Once the data transfer job has been processed and verified, AWS performs a software erasure of the Snowball appliance.
Snowball edge
is a 100TB data transfer device with on-board storage capabilities.
Snowmobile
is an exabyte-scale data transfer service used to move extremely large amounts of data to AWS. You can transfer up to 100 PB per Snowmobile, a 45-foot-long ruggedized shipping container pulled by a semi-trailer truck.
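For a back-of-the-envelope feel for these sizes, the sketch below estimates how many devices a given dataset would need, using the nominal capacities quoted above. It assumes 1 PB = 1,000 TB and ignores that usable capacity is typically lower than nominal. The function name and dataset size are made up.

```python
SNOWBALL_TB = 80          # larger Snowball size quoted above
SNOWBALL_EDGE_TB = 100    # Snowball Edge
SNOWMOBILE_TB = 100_000   # 100 PB, assuming 1 PB = 1,000 TB


def devices_needed(data_tb: int, capacity_tb: int) -> int:
    """Number of devices of the given nominal capacity needed for data_tb."""
    if data_tb < 0 or capacity_tb <= 0:
        raise ValueError("sizes must be positive")
    return -(-data_tb // capacity_tb)   # ceiling division on integers


dataset_tb = 500
snowballs = devices_needed(dataset_tb, SNOWBALL_TB)        # 500 / 80 rounds up to 7
edges = devices_needed(dataset_tb, SNOWBALL_EDGE_TB)       # exactly 5
snowmobiles = devices_needed(dataset_tb, SNOWMOBILE_TB)    # 1, mostly empty
```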