The multipart upload API enables you to upload large objects in parts.

Multipart upload is a three-step process: You initiate the upload, you upload the object parts, and after you have uploaded all the parts, you complete the multipart upload. Upon receiving the complete multipart upload request, Amazon S3 constructs the object from the uploaded parts, and you can then access the object just as you would any other object in your bucket.
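
In AWS CLI terms, which is what we'll use below, those three steps map to three s3api commands. Here is a quick sketch of the flow; the angle-bracket placeholders are illustrative and each command is covered step by step in the rest of the post:

aws s3api create-multipart-upload --bucket <bucket> --key <key>
aws s3api upload-part --bucket <bucket> --key <key> --part-number <n> --body <part-file> --upload-id <UploadId>
aws s3api complete-multipart-upload --bucket <bucket> --key <key> --upload-id <UploadId> --multipart-upload file://parts.json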

In this post we're going to use the AWS CLI to upload a large file to an S3 bucket in multiple parts.

Requirements

  • AWS Account

1. To create an S3 bucket, go to S3

2. Click on Create bucket

3. Give it a name and select the Region

4. Click on Create bucket

5. Go to IAM

6. Click on Users

7. Add user

8. Give it a name, check the Programmatic access type, and click on Next

9. In the next screen select Attach existing policies directly; for this lab we're going to add AmazonS3FullAccess, so search for it, check that permission, then click on Next

10. In the next window click Next

11. Review the information and click Create user

12. In the next window copy the Access key ID and the Secret Access key (click on show) to use them in the CLI

13. Install the AWS CLI for your OS from this link

14. Once you've installed the AWS CLI, open a terminal and check the version

aws --version
aws-cli/2.1.10 Python/3.7.3 Linux/5.7.1-050701-generic exe/x86_64.ubuntu.20 prompt/off

15. Run aws configure, enter the credentials of the user we created, and set the Region where we created the bucket

aws configure
AWS Access Key ID [****************QRE5]: AKIAZDD5XRMMIT7YWL2J
AWS Secret Access Key [****************QMy/]: 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
Default region name [us-west-2]: us-east-2
Default output format [json]:
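
If you want to double-check which identity the CLI is now using, you can run the following command; it prints the account ID and the ARN of the user we just created:

aws sts get-caller-identity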

16. Check that the CLI can see the bucket

aws s3 ls
2020-12-10 20:28:33 multi-part-upload-lab

17. If you can see the bucket, everything is working. Now let's create a dummy file to use in the multipart upload. The first step is to have a file to upload, so for this lab I'm going to create one of approximately 40 MB with the dd command

dd if=/dev/urandom of=bigfile.txt bs=1M count=40
40+0 records in
40+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 9.48438 s, 4.4 MB/s

18. Now let's split the file and upload the parts to S3. All of this work can be automated with a script (see the sketch after step 23), but I want to show you how to do it step by step.

Let's split the file into three parts with the split command. Keep in mind that every part except the last one must be at least 5 MB, which is why we use 18 MB chunks:

split -b 18M bigfile.txt part-

19. Check the files

du -sh part*
18M	part-aa
18M	part-ab
4.0M	part-ac

20. Now, to make a multipart upload we need to follow three steps. The first one is to tell S3 that we want to start a multipart upload, which we do with the create-multipart-upload command

aws s3api create-multipart-upload --bucket multi-part-upload-lab --key bigfile.txt

Where

  • --bucket indicates the name of the S3 bucket
  • --key is the name of the object (the file name in the bucket)

Once you execute the command, you will get a JSON response like this

{
    "Bucket": "multi-part-upload-lab",
    "Key": "bigfile.txt",
    "UploadId": "xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4"
}
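
Tip: instead of copying the UploadId by hand into the following commands, you can capture it in a shell variable using the CLI's --query and --output options (a small sketch, assuming a bash-like shell):

UPLOAD_ID=$(aws s3api create-multipart-upload --bucket multi-part-upload-lab --key bigfile.txt --query UploadId --output text)
echo $UPLOAD_ID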

21. The next step is to upload the parts. We're going to use the UploadId that the last command gave us.

To upload each part, we use the upload-part command

aws s3api upload-part --bucket multi-part-upload-lab --key bigfile.txt --part-number 1 --body part-aa --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4

Where

  • --bucket indicates the name of the S3 bucket
  • --key is the name of the object
  • --part-number is the part number, starting from 1
  • --body is the local file name of the part
  • --upload-id is the UploadId that the create-multipart-upload command gave us

After you execute the command, you'll get a JSON response with an ETag like this one

{
    "ETag": "\"369a52442b21748180402fb758035499\""
}

Repeat the command for each remaining part, incrementing --part-number and pointing --body at the next chunk. For our three parts, the two remaining calls look like this (reusing the same UploadId):
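
aws s3api upload-part --bucket multi-part-upload-lab --key bigfile.txt --part-number 2 --body part-ab --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4
aws s3api upload-part --bucket multi-part-upload-lab --key bigfile.txt --part-number 3 --body part-ac --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4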

22. If you want to see which parts have been uploaded so far, you can use the list-parts command

aws s3api list-parts --bucket multi-part-upload-lab --key bigfile.txt --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4
{
    "Parts": [
        {
            "PartNumber": 1,
            "LastModified": "2020-12-11T03:33:04+00:00",
            "ETag": "\"369a52442b21748180402fb758035499\"",
            "Size": 18874368
        },
        {
            "PartNumber": 2,
            "LastModified": "2020-12-11T03:34:31+00:00",
            "ETag": "\"f97a14373eca07f1f33eaad85e75b5cb\"",
            "Size": 18874368
        },
        {
            "PartNumber": 3,
            "LastModified": "2020-12-11T03:41:57+00:00",
            "ETag": "\"6d68fd86a0d074baa86f5efa76677bd2\"",
            "Size": 4194304
        }
    ],
    "Initiator": {
        "ID": "arn:aws:iam::625181428504:user/multi-upload",
        "DisplayName": "multi-upload"
    },
    "Owner": {
        "ID": "f80e87e725ee46fb70b95e3e1c6019a78cdc2888a5aa234d5dd0926f330b938f"
    },
    "StorageClass": "STANDARD"
}
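
Tip: the list-parts output can also generate the JSON file we need in the next step. With the CLI's --query option you can project just the ETag and PartNumber of each part; this sketch writes the result straight to parts.json:

aws s3api list-parts --bucket multi-part-upload-lab --key bigfile.txt --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4 --query '{Parts: Parts[].{ETag: ETag, PartNumber: PartNumber}}' > parts.json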

23. Now the final step is to use the complete-multipart-upload command. For this we need to confirm each ETag and part number to S3, so create a JSON file with this content.

nano parts.json

and paste this content, replacing the ETags with your own

{
    "Parts": [
        {
            "ETag": "\"369a52442b21748180402fb758035499\"",
            "PartNumber": 1
        },
        {
            "ETag": "\"f97a14373eca07f1f33eaad85e75b5cb\"",
            "PartNumber": 2
        },
        {
            "ETag": "\"6d68fd86a0d074baa86f5efa76677bd2\"",
            "PartNumber": 3
        }
    ]
}

Then run the command, pointing --multipart-upload at the file we just created:

aws s3api complete-multipart-upload --multipart-upload file://parts.json --bucket multi-part-upload-lab --key bigfile.txt --upload-id xUSzTLCzvdQA3pu4TX8GyApwoTmi5EckEkh8laozOjLF7DGym4Komt2ikc8xh7z1NwVbyZYQi7OyULTfc_lIWWWh8SVqrnldQ4EquLFD3RhhNDhvY1JbhouuDKqRIeT4
{
    "Location": "https://multi-part-upload-lab.s3.us-east-2.amazonaws.com/bigfile.txt",
    "Bucket": "multi-part-upload-lab",
    "Key": "bigfile.txt",
    "ETag": "\"70fd10aa9fbe6f234515104e07c98374-3\""
}
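
As mentioned in step 18, this whole flow can be automated. Below is a minimal bash sketch that splits a file, uploads every chunk, and completes the upload. It assumes a bash-like shell and the jq tool for building the part list, and it reuses the bucket and file names from this lab:

#!/bin/bash
set -e

BUCKET=multi-part-upload-lab
KEY=bigfile.txt
FILE=bigfile.txt

# Split into 18 MB chunks (every part except the last must be at least 5 MB)
split -b 18M "$FILE" part-

# Step 1: initiate the upload and capture the UploadId
UPLOAD_ID=$(aws s3api create-multipart-upload --bucket "$BUCKET" --key "$KEY" --query UploadId --output text)

# Step 2: upload each chunk, collecting its ETag and part number
PART=1
PARTS_JSON='{"Parts":[]}'
for CHUNK in part-*; do
    ETAG=$(aws s3api upload-part --bucket "$BUCKET" --key "$KEY" --part-number "$PART" --body "$CHUNK" --upload-id "$UPLOAD_ID" --query ETag --output text)
    PARTS_JSON=$(echo "$PARTS_JSON" | jq --arg etag "$ETAG" --argjson num "$PART" '.Parts += [{"ETag": $etag, "PartNumber": $num}]')
    PART=$((PART + 1))
done

# Step 3: complete the upload with the collected part list
echo "$PARTS_JSON" > parts.json
aws s3api complete-multipart-upload --bucket "$BUCKET" --key "$KEY" --upload-id "$UPLOAD_ID" --multipart-upload file://parts.json

Worth noting: the high-level aws s3 cp command already performs multipart uploads automatically for large files; the s3api route shown in this post is for when you need control over each individual step.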

24. Check the bucket: in the AWS Management Console, go to S3

25. Click on the bucket name

26. You will see bigfile.txt in the Objects tab

Great! You now know how to make multipart uploads. In the next posts we're going to go deeper into AWS services.

Clean-Up

1. Go to S3

2. Select the bucket and click on Empty

3. Confirm

4. Select the bucket and click on Delete

5. Confirm

6. Go to AWS IAM

7. Click on Users

8. Select the user we created for the lab and click on Delete user

9. Confirm and delete

10. Remove the CLI configuration

rm ~/.aws/credentials ~/.aws/config
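
One more thing worth checking before you delete the bucket: if a multipart upload was started but never completed, its parts keep consuming storage until you abort it. You can list any leftover uploads and abort them with these commands (the UploadId to pass is whatever list-multipart-uploads returns):

aws s3api list-multipart-uploads --bucket multi-part-upload-lab
aws s3api abort-multipart-upload --bucket multi-part-upload-lab --key bigfile.txt --upload-id <UploadId>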
