How to Sync Files and Directories to AWS S3 using s3cmd Tool
Abstract: Install and configure s3cmd, a command-line client for Amazon S3, then use it to create buckets and to upload, download, sync, and delete files and directories from the terminal.
Today we will show how to back up your data to Amazon Web Services. We will use s3cmd, a command-line client for Amazon S3 storage. It enables you to create, manage, and delete buckets from your terminal and to upload data from your server.
Install s3cmd on Ubuntu and CentOS
To install s3cmd on CentOS, we first need to add the EPEL repository:
yum install epel-release
Then install s3cmd
yum install s3cmd
On Ubuntu, it is inside official repository so we just need to run
apt-get install s3cmd
That will get you s3cmd installed on your computer.
Configuring s3cmd
Now that we have s3cmd installed, we need to connect it to our AWS account. We assume you are familiar with AWS pricing and have a working account, so we won't go through how to create one. Going straight to the configuration, as the root user type:
s3cmd --configure
Then work through the prompts as follows, substituting your own credentials for the example values:
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: ACESSSSSSSSSSSSSKEEEEEEY
Secret Key: 8ujSecret/82xqHWZqT5UzT0OCzUVvKeyyy
Default Region [US]:
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: password
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]:
On some networks all internet access must go through an HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: ACESSSSSSSSSSSSSKEEEEEEY
Secret Key: 8ujSecret/82xqHWZqT5UzT0OCzUVvKeyyy
Default Region: US
Encryption password: password
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'
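The saved file is plain INI text. A minimal ~/.s3cfg looks roughly like this (a sketch using the placeholder values from above; a real file generated by --configure contains many more options):

```ini
[default]
access_key = ACESSSSSSSSSSSSSKEEEEEEY
secret_key = 8ujSecret/82xqHWZqT5UzT0OCzUVvKeyyy
gpg_command = /usr/bin/gpg
gpg_passphrase = password
use_https = True
```

Editing this file by hand and re-running s3cmd --configure are both valid ways to change settings later.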
Now that we are connected to AWS, we can move on to using it. We will go through the relevant commands one by one.
1. Creating a bucket
To create a bucket in our account, we use the s3cmd mb command followed by the URL of the future bucket:
s3cmd mb s3://linoxide
Bucket 's3://linoxide/' created
Now that we have created it, let's list our buckets.
2. Listing buckets
To list all buckets that are currently available, type this command:
s3cmd ls
2016-12-03 15:52 s3://linoxide
Since it is a fresh account, we only have one that we created a moment ago.
3. Putting a directory into the bucket
Uploading files and folders is done with the put command. Let's create a folder on the local computer and put one file in it:
mkdir test
echo 12345 >> test/file1
Now, to put this folder into our S3 bucket, we use the put command:
s3cmd put -r test s3://linoxide
upload: 'test/file1' -> 's3://linoxide/test/file1' [1 of 1]
6 of 6 100% in 0s 60.62 B/s done
Notice the -r in the command: it means recursive, which is needed when uploading folders. For single files it can be omitted.
4. Uploading files
As mentioned above, uploading a single file works the same way as a directory, except you omit -r:
s3cmd put test/file1 s3://linoxide
upload: 'test/file1' -> 's3://linoxide/file1' [1 of 1]
6 of 6 100% in 0s 44.00 B/s done
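If you want each file in a directory uploaded with its own put call rather than one recursive upload, a small loop works. This is a sketch: the bucket name linoxide is the article's example, and we stub s3cmd with echo so the sketch runs without AWS credentials; remove the stub line to perform real uploads.

```shell
# Sketch: upload every regular file in ./test with one "put" per file.
# s3cmd is stubbed with echo so this runs without AWS credentials;
# remove the next line for real uploads.
S3CMD="echo s3cmd"

mkdir -p test
printf '12345\n' > test/file1

for f in test/*; do
    [ -f "$f" ] && $S3CMD put "$f" s3://linoxide/
done
```

With the stub in place, the loop simply prints the s3cmd commands it would run, which is also a handy way to preview a batch before committing to it.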
5. Listing the contents of the bucket
Since we put some data into the bucket, we want to see what is inside. Earlier we used s3cmd ls to list all our buckets; now we run the same command with the bucket URI to get that bucket's contents:
s3cmd ls s3://linoxide
DIR s3://linoxide/test/
2016-12-03 17:21 6 s3://linoxide/file1
We have the directory test and, alongside it, the file file1.
6. Downloading files and folders
Downloading is done with the get command; as with put, you need the -r option for folders.
s3cmd get -r s3://linoxide/
ERROR: Parameter problem: File ./file1 already exists. Use either of --force / --continue / --skip-existing or give it a new name.
The trouble here is that we already have those files locally, which is expected, since we just uploaded them from here. Let's clear them out first.
rm -rf file1 test/
After clearing out, we can download:
s3cmd get -r s3://linoxide/
download: 's3://linoxide/file1' -> './file1' [1 of 2]
6 of 6 100% in 0s 97.99 B/s done
download: 's3://linoxide/test/file1' -> './test/file1' [2 of 2]
6 of 6 100% in 0s 112.95 B/s done
We can download individual directories or files, or, as we did here, the entire bucket.
7. Deleting files and folders
To delete a folder you can use the del command; no -r flag needed:
s3cmd del s3://linoxide/test
To purge all data from the bucket, you need the -r and -f (force) options:
s3cmd del -f -r s3://linoxide/
8. Syncing entire directories
s3cmd supports syncing directories. For example, if we have 5 files inside the directory test (after running the touch test/file{1..5} command), we can try to sync that directory:
s3cmd sync --dry-run test/ s3://linoxide
upload: 'test/file1' -> 's3://linoxide/file1'
upload: 'test/file2' -> 's3://linoxide/file2'
upload: 'test/file3' -> 's3://linoxide/file3'
upload: 'test/file4' -> 's3://linoxide/file4'
upload: 'test/file5' -> 's3://linoxide/file5'
WARNING: Exiting now because of --dry-run
We ran it with the --dry-run option, which means it only lists the files that would be synced, without actually transferring them. Note the trailing slash on test/: it syncs the contents of the directory into the bucket root. Removing the slash gives us a different destination, inside a test folder:
s3cmd sync --dry-run test s3://linoxide
upload: 'test/file1' -> 's3://linoxide/test/file1'
upload: 'test/file2' -> 's3://linoxide/test/file2'
upload: 'test/file3' -> 's3://linoxide/test/file3'
upload: 'test/file4' -> 's3://linoxide/test/file4'
upload: 'test/file5' -> 's3://linoxide/test/file5'
Removing --dry-run actually uploads the files.
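Because sync only transfers changed files, it is the natural building block for a scheduled backup job. Below is a minimal sketch: BACKUP_DIR and the bucket name are example values, and s3cmd is stubbed with echo so the sketch runs without AWS credentials; remove the stub line for real use.

```shell
# Sketch of a backup job built on "s3cmd sync".
# BACKUP_DIR and BUCKET are example values; s3cmd is stubbed
# with echo so this runs without AWS credentials.
S3CMD="echo s3cmd"
BACKUP_DIR="test/"
BUCKET="s3://linoxide"

mkdir -p "$BACKUP_DIR"
# --delete-removed mirrors local deletions into the bucket
$S3CMD sync --delete-removed "$BACKUP_DIR" "$BUCKET"
```

Saved as, say, /usr/local/bin/s3-backup.sh (a hypothetical path), a crontab entry like 0 2 * * * /usr/local/bin/s3-backup.sh would run it every night at 02:00.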
9. Syncing with exclusion lists
Let's add some files ending in .txt:
touch test/file{6..8}.txt
They should be there along with the other files:
ls test/
file1 file2 file3 file4 file5 file6.txt file7.txt file8.txt
Now we want to sync this directory without the files that end in .txt. We do this with the following command:
s3cmd sync --dry-run test/ --exclude '*.txt' s3://linoxide
Running the same command without the --dry-run flag uploads the files, with the .txt files excluded.
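The --exclude option can be repeated, and --include can re-admit files that an earlier pattern excluded (both are real s3cmd options; the patterns below are made-up examples). A sketch, again with s3cmd stubbed by echo so it runs without credentials:

```shell
# Sketch: combine several filter patterns; --include wins back
# matches that --exclude dropped. s3cmd is stubbed with echo.
S3CMD="echo s3cmd"
$S3CMD sync --dry-run test/ \
    --exclude '*.txt' --exclude '*.tmp' \
    --include 'important.txt' \
    s3://linoxide
```

Quoting the patterns matters: unquoted, the shell would expand the globs against local filenames before s3cmd ever saw them.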
10. Removing the bucket
To delete the bucket, first purge all its data, as we did before:
s3cmd del -f -r s3://linoxide/
Then remove it with the rb (remove bucket) command:
s3cmd rb s3://linoxide/
Conclusion
We have installed s3cmd and used it to back up data to the Amazon cloud. It is a quick and easy way to get your data backed up without keeping a physical backup with you, provided you are comfortable with Amazon's pricing. If not, you may still need to buy an additional hard drive and back up your data the old-fashioned way. Thank you for reading.