Supported by the National Science Foundation Collaborator: University of Michigan Collaborator: Michigan State University Collaborator: Wayne State University Collaborator: Indiana University

Getting started with boto

You’ll first need to install boto with pip: pip install boto

There are also python-boto packages available for most Linux distributions.

There are two boto versions: boto2 and boto3. Most of these examples target boto2. If you prefer to use boto3, change the command above to ‘pip install boto3’.

You can get your S3 access key and secret key from OSiRIS COmanage under the Profile menu at the upper right of the screen. For more information, please have a look at our S3 Instructions page. You will need these credentials to run any of the examples below.

Creating a connection to OSiRIS

Using boto in a Python script requires you to import both boto and boto.s3.connection as follows:

#!/usr/bin/env python
import boto
import boto.s3.connection

access_key = 'access_key from comanage'
secret_key = 'secret_key from comanage'
osris_host = 'rgw.osris.org'

# Setup a connection
conn = boto.connect_s3(aws_access_key_id = access_key,
  aws_secret_access_key = secret_key,
  host = osris_host,
  is_secure = True,
  port = 443,
  calling_format = boto.s3.connection.OrdinaryCallingFormat(),
)
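Hard-coding keys in a script makes them easy to leak by accident. As a minimal sketch, the credentials could instead be read from environment variables. The variable names OSIRIS_ACCESS_KEY and OSIRIS_SECRET_KEY and the helper name are hypothetical; neither OSiRIS nor boto defines them.

```python
import os


def load_credentials(environ=os.environ):
    """Return (access_key, secret_key) read from the environment.

    OSIRIS_ACCESS_KEY and OSIRIS_SECRET_KEY are hypothetical names
    chosen for this example; use whatever names suit your setup.
    """
    try:
        return environ['OSIRIS_ACCESS_KEY'], environ['OSIRIS_SECRET_KEY']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {0}'.format(missing))
```

The returned pair can then be passed to boto.connect_s3 as aws_access_key_id and aws_secret_access_key in place of the literal strings.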

Listing owned buckets

The following example gets a list of buckets that you own. It prints the name and creation date of each bucket.

# print() with a single argument works in both Python 2 and Python 3
for bucket in conn.get_all_buckets():
  print("{name}\t{created}".format(
    name = bucket.name,
    created = bucket.creation_date,
  ))

The result should look something like this:

mahbuckat1   2011-04-21T18:05:39.000Z
mahbuckat2   2011-04-21T18:05:48.000Z
mahbuckat3   2011-04-21T18:07:18.000Z

Creating a bucket

Creating a new bucket is simple:

bucket = conn.create_bucket('mycou-bucket')

Please note that bucket names on OSiRIS should start with your COU or ‘virtual organization’ in lowercase. You can find this information under your Profile menu (upper-right after login) in OSiRIS COmanage.
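The naming rule can be sketched as a pair of small helpers. Both function names are hypothetical, and joining the COU and the rest of the name with a hyphen simply mirrors the 'mycou-bucket' example; the separator is an assumption, not a documented requirement.

```python
def bucket_name_for(cou, suffix):
    """Build a bucket name that starts with the lowercase COU.

    The hyphen separator follows the 'mycou-bucket' example and is
    an assumption made for this sketch.
    """
    return '{0}-{1}'.format(cou.lower(), suffix)


def has_cou_prefix(bucket_name, cou):
    """Check that a bucket name begins with the lowercase COU."""
    return bucket_name.startswith(cou.lower())
```

For example, bucket_name_for('MyCOU', 'bucket') yields a name that can be passed to conn.create_bucket().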

Deleting a bucket

Deleting a bucket is also simple.

bucket = conn.get_bucket('mycou-bucket')
conn.delete_bucket(bucket.name)

Note that the bucket must be empty in order to delete it. There is no way to force a non-empty bucket to be deleted.
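Because a non-empty bucket cannot be deleted, one common approach is to delete every object first. The helper below is a minimal sketch using only calls shown on this page. The function name is hypothetical, it assumes an unversioned bucket, and it permanently removes data, so use it with care.

```python
def empty_and_delete_bucket(conn, bucket_name):
    """Delete every object in a bucket, then delete the bucket itself.

    `conn` is a boto2 S3 connection like the one created earlier.
    Returns the number of objects that were deleted.
    """
    bucket = conn.get_bucket(bucket_name)
    deleted = 0
    for key in bucket.list():
        bucket.delete_key(key.name)
        deleted += 1
    # The bucket is now empty, so the delete request can succeed
    conn.delete_bucket(bucket_name)
    return deleted
```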

Creating an object

This creates a file hello.txt with the string Hello World!

bucket = conn.get_bucket('mycou-bucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('Hello World!')

Listing a bucket's content

bucket.list() gets a list of objects in the bucket. The example below prints out the name, file size, and last modified date of each object.

# print() with a single argument works in both Python 2 and Python 3
for key in bucket.list():
  print("{name}\t{size}\t{modified}".format(
    name = key.name,
    size = key.size,
    modified = key.last_modified,
  ))
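The same listing loop can be used to summarize a bucket. The sketch below counts objects and adds up their sizes, optionally restricted to names that start with a given prefix via the prefix argument of bucket.list(). The function name is hypothetical.

```python
def summarize_bucket(bucket, prefix=''):
    """Return (object_count, total_bytes) for keys under `prefix`.

    `bucket` is a boto2 bucket object, for example the result of
    conn.get_bucket('mycou-bucket').
    """
    count = 0
    total = 0
    for key in bucket.list(prefix=prefix):
        count += 1
        total += key.size
    return count, total
```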

Change an object's ACL

The ACL can be assigned to a file as shown below:

bucket = conn.get_bucket('mycou-bucket')
public_hello = bucket.get_key('hello.txt')
public_hello.set_canned_acl('public-read')
private_hello = bucket.get_key('private_hello.txt')
private_hello.set_canned_acl('private')

A similar process can be used for assigning an ACL to a bucket:

bucket = conn.get_bucket('mycou-bucket')
bucket.set_canned_acl('public-read')

Information on bucket ACLs can be found here

Delete an object

This deletes the object hello.txt

bucket = conn.get_bucket('mycou-bucket')
bucket.delete_key('hello.txt')

Download an object to a file

This example downloads the object hello.txt and saves it in /tmp.

bucket = conn.get_bucket('mycou-bucket')
key = bucket.get_key('hello.txt')
key.get_contents_to_filename('/tmp/hello.txt')

Generate a signed or unsigned object download URL

An unsigned download URL works when a key is publicly readable.

bucket = conn.get_bucket('mycou-bucket')
hello_key = bucket.get_key('hello.txt')
hello_url = hello_key.generate_url(0, query_auth=False)
print(hello_url)

The output will look something like:

https://rgw.osris.org/mycou-bucket/hello.txt

Signed download URLs work for the time specified (in seconds) even if the object is private. The URL stops working once that period is up.

plans_key = bucket.get_key('secret_plans.txt')
plans_url = plans_key.generate_url(3600, query_auth=True)
print(plans_url)

The output will look something like:

https://rgw.osris.org/mycou-bucket/secret_plans.txt?Signature=XXXXXXXXXXXXXXXXXXXXXXXXXXX&Expires=1316027075&AWSAccessKeyId=XXXXXXXXXXXXXXXXXXX
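The Expires parameter in a signed URL of this form is a Unix timestamp. As a small sketch, assuming the URL layout shown in the sample output, the remaining lifetime can be read back from the URL itself. The function name is hypothetical.

```python
import time

try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs      # Python 2


def seconds_until_expiry(signed_url, now=None):
    """Return the seconds left before a signed URL expires.

    A negative result means the URL has already expired. Returns
    None when the URL has no Expires parameter (an unsigned URL).
    """
    query = parse_qs(urlparse(signed_url).query)
    if 'Expires' not in query:
        return None
    now = time.time() if now is None else now
    return int(query['Expires'][0]) - int(now)
```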

Using Server Side Encryption (SSE-C)

We first need to create the request headers that carry the encryption key to the server. Three headers are required: x-amz-server-side-encryption-customer-algorithm, x-amz-server-side-encryption-customer-key, and x-amz-server-side-encryption-customer-key-MD5. The encryption algorithm must be "AES256"; the key must be a 256-bit, base64-encoded encryption key; and the MD5 must be the base64-encoded 128-bit MD5 digest of the encryption key. More information on SSE-C is available from the Amazon S3 Documentation. Here is an example of how to create the headers:

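import base64
import hashlib

# Don't lose this secret, it is required to decrypt your data!
# The secret must be exactly 32 bytes (256 bits) long.
keystring = b'32characterSecretStringXXXXXXXXX'
# Decode to text so the header values are plain strings in Python 2 and 3
key = base64.b64encode(keystring).decode('ascii')
md5key = base64.b64encode(hashlib.md5(keystring).digest()).decode('ascii')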

header = {
        "x-amz-server-side-encryption-customer-algorithm":"AES256",
        "x-amz-server-side-encryption-customer-key":key,
        "x-amz-server-side-encryption-customer-key-MD5":md5key
        }

It is your responsibility to store your key string (the variable keystring) somewhere. WE ARE UNABLE TO DECRYPT YOUR DATA! IF YOU LOSE THE KEY, WE WILL BE UNABLE TO ASSIST YOU IN DECRYPTING THE DATA. In the sample script encryption.py, we’ve included the steps to generate a random keystring and store it in a file for re-use.
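The following is a minimal sketch of generating a random key and keeping it in a file for re-use. It is not the contents of encryption.py; the function names and the idea of storing the raw 32 bytes in a file readable only by its owner are assumptions made for this example.

```python
import base64
import hashlib
import os


def load_or_create_keystring(path):
    """Return a 32-byte SSE-C secret, creating and saving it if needed."""
    if os.path.exists(path):
        with open(path, 'rb') as f:
            keystring = f.read()
    else:
        keystring = os.urandom(32)
        # O_EXCL refuses to overwrite an existing file; 0o600 limits
        # access to the file's owner on POSIX systems
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, 'wb') as f:
            f.write(keystring)
    if len(keystring) != 32:
        raise ValueError('SSE-C key must be exactly 32 bytes')
    return keystring


def sse_c_headers(keystring):
    """Build the three SSE-C request headers for a 32-byte key."""
    key = base64.b64encode(keystring).decode('ascii')
    md5key = base64.b64encode(hashlib.md5(keystring).digest()).decode('ascii')
    return {
        'x-amz-server-side-encryption-customer-algorithm': 'AES256',
        'x-amz-server-side-encryption-customer-key': key,
        'x-amz-server-side-encryption-customer-key-MD5': md5key,
    }
```

Back up the key file somewhere safe: objects encrypted with a lost key cannot be recovered.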

Uploading a file

We pass the upload method the file path and the header information, and set encrypt_key to True. We first make a copy of our header dict with ul_header = copy(header), because the contents of the dictionary passed as headers are modified when it is used. Copying it each time lets us re-use the original dictionary for future operations.

from copy import copy

# upload_file is the path of the local file to upload
ul_header = copy(header)
bucket = conn.create_bucket('testcou-testfile2')
k = bucket.new_key("test.txt")
k.set_contents_from_filename(upload_file, headers=ul_header, encrypt_key=True)

Expect some performance overhead when reading encrypted objects compared with unencrypted ones.

Downloading an encrypted file

Downloading a file is very similar to uploading one, except that only the header and file path are required. Again we make a copy of the header dictionary.

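# download_file is the local path to write the decrypted object to
dl_header = copy(header)
bucket = conn.get_bucket('testcou-testfile2')
# new_key() only builds a local Key object; the object is fetched by the call below
kn = bucket.new_key("test.txt")
kn.get_contents_to_filename(download_file, headers=dl_header)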

Force Encryption Policy

To require that all objects uploaded to a bucket are encrypted, use the following bucket policy. The %s placeholder is replaced with the bucket name when the policy is applied.

json_policy = """{
   "Version":"2012-10-17",
   "Id":"PutObjPolicy",
   "Statement":[{
         "Sid":"DenyUnEncryptedObjectUploads",
         "Effect":"Deny",
         "Principal":{
            "AWS":"*"
         },
         "Action":"s3:PutObject",
         "Resource":"arn:aws:s3:::%s/*",
         "Condition":{
            "StringNotEquals":{
               "s3:x-amz-server-side-encryption":"AES256"
            }
         }
      }
   ]
}"""

Then set the policy, substituting the bucket name into the Resource ARN, with bucket.set_policy(json_policy % bucket.name).
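As an alternative sketch, the same policy can be built as a Python dictionary and serialized with the json module, which avoids quoting mistakes in a hand-written string. The function name is hypothetical. Version is set to 2012-10-17, the policy language version string defined by AWS.

```python
import json


def force_encryption_policy(bucket_name):
    """Return a deny-unencrypted-uploads bucket policy as a JSON string."""
    policy = {
        'Version': '2012-10-17',
        'Id': 'PutObjPolicy',
        'Statement': [{
            'Sid': 'DenyUnEncryptedObjectUploads',
            'Effect': 'Deny',
            'Principal': {'AWS': '*'},
            'Action': 's3:PutObject',
            # Applies to every object in the named bucket
            'Resource': 'arn:aws:s3:::{0}/*'.format(bucket_name),
            'Condition': {
                'StringNotEquals': {
                    's3:x-amz-server-side-encryption': 'AES256',
                },
            },
        }],
    }
    return json.dumps(policy)
```

The result can be applied with bucket.set_policy(force_encryption_policy(bucket.name)).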

Sample Script

  • osiris-boto-example.py - This script creates two buckets and an object in each bucket. It then gives a signed URL for one of the objects and an unsigned URL for the other object. The example also shows how to delete an object.
  • encryption.py - This script covers the encryption steps described above.

More Information

More examples at ceph.com

S3 Docs for Boto2

Most of these examples are adapted from the docs linked above at ceph.com.

These examples and other examples at ceph.com refer to boto2 but you may be using boto3. Docs for that version are at the URL below:

S3 Docs for Boto3

OSiRIS S3 supports Server Side Encryption with client-provided keys (SSE-C), as used in the encryption examples above. The full specification for SSE-C is documented here.