S3 (Simple Storage Service)
AWS S3 stands for Simple Storage Service. It is a managed storage service provided by AWS to store content (files, audio, video, images, etc.) in the form of objects inside buckets created in S3 (a bucket is simply a container for storing objects).
S3 allows users to upload and download objects of any type, up to a maximum of 5 TB per object. The user has control over the accessibility of the data, i.e. public or private, based on your requirements.
Note:
- There is no limit on the size of a bucket; it is limitless. However, there is a limit on the size of each object you store in a bucket: a maximum of 5 TB per object.
- By default we can create up to 100 buckets per account, but based on need we can increase this limit by raising a service limit increase request with the AWS Support Center.
- S3 bucket names must be globally unique across all AWS accounts, but we still need to choose a Region, because the Region determines where the data is stored and how quickly users can access it (see the sketch after this list).
- S3 is designed for 99.999999999% (11 nines) durability of objects stored in S3 buckets.
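As a quick preview of what bucket creation looks like from the command line (the AWS CLI is covered in detail later in this article), here is a minimal sketch; the bucket name is a placeholder you must replace with a globally unique name, and ap-south-1 is just an example Region:
# create a bucket in ap-south-1 (for us-east-1, omit --create-bucket-configuration)
aws s3api create-bucket --bucket <your-unique-bucket-name> --region ap-south-1 --create-bucket-configuration LocationConstraint=ap-south-1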
Features Of S3
- Low Cost
- Scalable
- High Performance
- Secure
How to create a bucket in S3?
Step 1: Go to your AWS console, click on Services, and then go to S3.
Step 2: Click on Create bucket and provide a name for your S3 bucket. Remember that the bucket name must be unique: another user might already have a similar bucket name, either in your account or in another account, because S3 bucket names must be globally unique. Click on Create bucket.
After that, the bucket will be created successfully and will show up in the bucket list.
The purpose of choosing an AWS Region while creating an S3 bucket is latency: suppose your EC2 instance is created in the us-east-1 Region; then to fetch the data quickly with less latency, you should create your S3 bucket in that same Region.
Upload content to S3 Bucket
Step 1: Click on your S3 bucket
Step 2: Click on the Upload option
Step 3: Browse for the content that you want to upload to your S3 bucket and click on Upload
Step 4: Once you have uploaded your content, you can go ahead and validate it in the S3 bucket.
Here are some additional features provided by AWS S3, given below:
FEATURES:
- Host Static Website Using S3 Bucket
- Versioning
Step 1: Go to your S3 bucket and click on Properties.
Step 2: Click on Static website hosting and select the first option, i.e. Use this bucket to host a website.
Step 3: In the Index document field at the top of the screen, provide the file name, i.e. index.html, and click on Save.
NOTE: Make sure your bucket and its contents are public, so that the website can be accessed from outside, i.e. from the internet:
Step 1: Go to your S3 bucket
Step 2: Select the item that you want to make public
Step 3: Go to the Actions tab and click on Make Public
Finally, your content will be public
To make your S3 bucket public
Step 1: Go to your S3 bucket
Step 2: Go to the Permissions option
Step 3: Go to Block public access
Step 4: Click on Edit on the right side, uncheck all the options, and click on Save
REMEMBER: Your bucket should have an index.html (or whichever HTML file you want to open) so that when you click on the URL provided by the S3 static website endpoint, that file opens.
Step 5: Copy the endpoint URL shown in blue text under Static website hosting
Step 6: Paste that endpoint into your browser and see your website content
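The same static website setup can also be scripted with the AWS CLI. Below is a minimal sketch, assuming the bucket already contains index.html; the file name policy.json is a placeholder of our own:
# enable static website hosting with index.html as the index document
aws s3 website s3://<bucket-name>/ --index-document index.html
# allow public bucket policies (undoes the Block public access defaults)
aws s3api put-public-access-block --bucket <bucket-name> --public-access-block-configuration BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false
# attach a public-read bucket policy stored in policy.json
aws s3api put-bucket-policy --bucket <bucket-name> --policy file://policy.json
Here policy.json grants read access on all objects in the bucket to everyone:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    }
  ]
}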
Versioning is another important feature of S3: it keeps a record of all changes to the contents of your S3 bucket. If you update an object in your S3 bucket, you can get the old version back using versioning.
To enable versioning in S3 bucket
Step 1: Go to the Properties option of your S3 bucket
Step 2: Select Versioning from there
Step 3: Click on Enable versioning and click on Save
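Versioning can also be enabled from the AWS CLI (introduced later in this article); a minimal sketch:
# enable versioning on the bucket
aws s3api put-bucket-versioning --bucket <bucket-name> --versioning-configuration Status=Enabled
# confirm that versioning is now enabled
aws s3api get-bucket-versioning --bucket <bucket-name>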
Lifecycle Policy In S3 Bucket
We can use lifecycle policies to define actions that we want Amazon S3 to take during an object's lifetime. We can define a lifecycle policy for all objects in the bucket, or for a subset of them, by using a shared prefix (i.e. names starting with a common string).
To create a lifecycle policy:
Step 1: Go to your S3 bucket and move to the Management option
Step 2: Click on Lifecycle
Step 3: Click on Add a lifecycle rule
Step 4: Provide a name for your lifecycle rule
Step 5: Select the scope of the lifecycle rule: all objects with a specific prefix name or tag, or all the objects in the bucket.
a) To apply this lifecycle rule to all the objects with a specific prefix name or tag, select Limit the scope to specific prefixes or tags. In the Add prefix or tag filter box, type the prefix name or tag name, and press Enter.
Step 6: Click on Next and select the versions for which we want to define transitions, current or noncurrent:
a) For transitions that are applied to the current version of the object, choose Current version.
b) For transitions that are applied to all previous versions of the object, choose Previous versions.
For demo purposes we are choosing Current version.
Step 7: Now move further to add the transition:
a) For a current version, under For current object versions, select Add transition.
b) For a non-current version, under For non-current object versions, select Add transition.
Step 8: For each transition that you add, choose one of the following:
a) Transition to Standard-IA after.
b) Transition to Intelligent-Tiering after.
c) Transition to One Zone-IA after.
d) Transition to Glacier after.
e) Transition to Glacier Deep Archive after.
NOTE: When you choose the Glacier or Glacier Deep Archive storage class, your objects remain in Amazon S3; you cannot access them directly through the separate Amazon S3 Glacier service.
Step 9: In the Days after creation box, enter the number of days after the creation of the object that you want the transition to be applied (for example, 30 or 120 days).
Step 10: When you are done configuring transitions, click on Next.
Step 11: Under Configure expiration, for this example, choose both Current version and Previous versions.
Step 12: Choose Expire current version of object, and then enter the number of days after object creation to delete the object (e.g., 395 days).
REMEMBER: If you choose this expire option, you cannot choose the option for cleaning up expired delete markers.
Step 13: Choose Permanently delete previous versions, and then enter the number of days after an object becomes a previous version to permanently delete the object (e.g., 465 or 700 days).
Step 14: To delete incomplete multipart uploads, it is recommended that you check Clean up incomplete multipart uploads and then enter the number of days after multipart upload initiation to abandon the upload (for example, 7 or 14 days).
Step 15: Click on Next.
Step 16: For Review, check the settings for your rule and select Save.
If the lifecycle rule does not contain any errors, it is enabled and you can see it on the Lifecycle page.
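For reference, the same kind of rule can also be applied from the AWS CLI as JSON. Below is a minimal sketch using the example day counts from the steps above; the rule ID and the logs/ prefix are placeholders of our own:
{
  "Rules": [
    {
      "ID": "demo-lifecycle-rule",
      "Filter": { "Prefix": "logs/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 120, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 395 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 465 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
Save this as lifecycle.json and apply it with:
aws s3api put-bucket-lifecycle-configuration --bucket <bucket-name> --lifecycle-configuration file://lifecycle.json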
Performing S3 operations from the AWS CLI
What is the AWS CLI?
The AWS CLI is the command line interface for Amazon Web Services, through which we can perform all the AWS operations via CLI commands.
Now let's have a look at an example where we will perform a few S3 operations using the AWS CLI.
To perform these operations we must have an IAM user created with S3 full access, so that on behalf of this user we can create an S3 bucket.
To learn about IAM and how to create a user with the appropriate permissions, please refer to the IAM section (REFERENCE LINK FOR IAM IN AWS)
Installing the AWS CLI on Ubuntu
SSH into the EC2 instance and execute the below command to update the instance and install the AWS CLI:
sudo apt update -y && sudo apt install awscli -y
Configuring the user on the AWS CLI
Now configure the user that we created on the AWS CLI: execute the below command and enter the access key and secret key
aws configure
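The command walks you through four interactive prompts, roughly like the following (the key values shown are placeholders, not real credentials):
$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-east-1
Default output format [None]: json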
S3 AWS CLI commands:
For creating the bucket, execute the below command
aws s3 mb s3://<bucket-name>
mb stands for make bucket
Now let's have a look at the S3 console to check whether our bucket was created or not.
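Alternatively, we can verify it from the CLI itself: aws s3 ls with no arguments lists all the buckets in the account.
aws s3 ls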
For copying an object from your local system or EC2 instance to the bucket, execute the below command:
aws s3 cp <object-name> s3://<bucket-name>/
For downloading an object from an S3 bucket to the local system:
aws s3 cp s3://<bucket-name>/<object-name> /path/to/download/to/system
Copying an object from one bucket to another bucket
aws s3 cp s3://<source-bucket>/<object-name> s3://<destination-bucket>/
Removing an object from an S3 bucket
aws s3 rm s3://<bucket-name>/<object-name>
Copying multiple objects to an S3 bucket
aws s3 cp /objects/path/ s3://<bucket-name>/ --recursive
Removing an S3 bucket
aws s3 rb s3://<bucket-name>/ --force
rb stands for remove bucket, and --force will delete all the folders and subfolders inside the bucket before deleting the bucket itself
Multipart uploads on S3
Why do we need multipart uploads?
Suppose we have a very large object. If we upload that object directly to S3 in a single request, it will take a lot of time, and in the worst case, if our internet connection drops mid-way, the upload fails and we have to upload the entire object again from the beginning.
To overcome this, AWS introduced the concept of multipart uploads, in which the object is divided into multiple parts and each part is uploaded one by one; in case of an internet failure, we only have to re-upload the particular part that failed. Once all the parts are uploaded to S3, they are combined together into a single object.
Create multipart uploads
To create a multipart upload we must have a user configured in our AWS CLI. In the below example we will be uploading a .mp4 file through a multipart upload.
First let's split our video into multiple parts. To split it, execute the below command:
split -b 18M <object-name> part-
Here 18M means each part will be 18 MB in size.
Now if we run ls, it will show us the split parts.
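For example, assuming the source file was a roughly 50 MB video named video.mp4 (a hypothetical name), split produces parts with its default aa, ab, ac suffixes:
$ ls
part-aa  part-ab  part-ac  video.mp4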
Now execute the below command to initialize the multipart upload:
aws s3api create-multipart-upload --bucket <bucket-name> --key <object-name>
This command will return an upload ID, which is very important, so save it somewhere in a text document, as it will be used later.
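The output looks something like this, with the placeholders filled in with your actual values:
{
  "Bucket": "<bucket-name>",
  "Key": "<object-name>",
  "UploadId": "<upload-id>"
}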
Now we have to start uploading the parts of the file that we split in the previous step. Execute the below command to upload the first part:
aws s3api upload-part --bucket <bucket-name> --key <object-name> --part-number 1 --body <part-name> --upload-id <upload-id>
This command will return an ETag; save it somewhere in your text document, as it will also be used later.
Now execute the same command again to upload the next part: just replace the part number and part name with the new ones, and save the ETag each time.
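For example, the command for the second part would look like this (part names assume the default suffixes produced by split in the earlier step):
aws s3api upload-part --bucket <bucket-name> --key <object-name> --part-number 2 --body part-ab --upload-id <upload-id>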
After uploading the parts, if we have a look at our S3 bucket it will still appear empty, because we have uploaded the object in parts and have not yet combined them.
To have a look at the uploaded parts, we can execute the below command:
aws s3api list-parts --bucket <bucket-name> --key <object-name> --upload-id <upload-id>
Now create a JSON file named file.json, and in this file put all the ETags that we copied, in the format shown below:
{
  "Parts": [
    { "ETag": "5833b49414367e97617e822236604fd6", "PartNumber": 1 },
    { "ETag": "00d8da74339eaecd0cb6cd84d9efe98c", "PartNumber": 2 },
    { "ETag": "8c66f4d523a3db19448e3056f33c421b", "PartNumber": 3 }
  ]
}
Now let's complete the multipart upload and combine the parts that we have uploaded. Execute the below command:
aws s3api complete-multipart-upload --multipart-upload file://file.json --bucket <bucket-name> --key <object-name> --upload-id <upload-id>
Now when we go to our S3 bucket, we will see that our object has been uploaded as a single combined object.
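We can also confirm it from the CLI by listing the contents of the bucket:
aws s3 ls s3://<bucket-name>/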