AWS S3 using Python

Amazon Simple Storage Service (S3) is one of the most widely used Amazon Web Services (AWS) services. It provides highly scalable, reliable, and cost-effective cloud storage. Interacting with S3 programmatically is a common task when working with AWS-based applications. In this post, we'll explore different ways to work with AWS S3 using Python, such as creating a bucket, setting and getting a bucket policy, and uploading and downloading files.

At the end of the post, I will provide the complete code.


Before we jump into the code, if you're interested in setting up an AWS account, this post will walk you through the process of getting started with AWS.


Prerequisites

To work with AWS S3 using Python, we need to install and configure a few things:

  • Setting Up Boto3: Boto3 is the official AWS SDK for Python and allows you to interact with AWS services directly from your Python code. Before we start, ensure that the Boto3 library is installed in your Python environment. You can easily install Boto3 via pip:
pip install boto3
  • AWS Config: To configure AWS settings, add the following lines to the ~/.aws/config (Windows: C:\Users\<userId>\.aws\config) file:
[default]
region = eu-west-1
output = json

As an example, these settings specify the default AWS region as eu-west-1 and the output format as json. Adjust them to match your own setup.

  • AWS Credentials: We also need AWS credentials, which should be placed in the ~/.aws/credentials (Windows: C:\Users\<userId>\.aws\credentials) file. We can obtain these credentials from the AWS console. The aws_session_token line is only needed for temporary credentials; long-term access keys do not have one.
[default]
aws_access_key_id=<access-key-id>
aws_secret_access_key=<aws-secret-access-key>
aws_session_token=<aws-session-token>

Note: If you're uncertain about retrieving these credentials, you can refer to the instructions provided here, in section "4 Protect your accounts"
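
Both files use the INI format, so one quick way to sanity-check them is to parse them with Python's standard configparser module. The following is a minimal sketch and not part of Boto3; the helper name read_profile_settings and the sample text are assumptions made for illustration.

```python
import configparser


def read_profile_settings(config_text, profile="default"):
    """Return the key/value pairs of one profile from INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(profile):
        return {}
    return dict(parser.items(profile))


# Sample text mirroring the ~/.aws/config example above
sample_config = """
[default]
region = eu-west-1
output = json
"""

settings = read_profile_settings(sample_config)
print(settings["region"])  # eu-west-1
```

To check a real file, read its contents with open() and pass the text to the same helper.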

Initializing the S3 Client

Now we're ready to perform S3 operations using Python. First, we need to initialize an S3 client using the Boto3 library. This client acts as an interface to the S3 service and allows us to execute various commands, such as creating buckets, uploading files, and managing object permissions.

It is not mandatory, but for better code organization, we'll create a Python class named S3Utilities and implement each S3 operation as a member method within it.

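import json  # used later to serialize bucket policies

import boto3
from botocore.exceptions import ClientError  # raised by failed S3 API calls

class S3Utilities:
    def __init__(self):
        # Initialize the S3 client
        self.s3_client = boto3.client('s3')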
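
One variation worth considering, shown here as a sketch rather than the approach used in the rest of this post, is to let the constructor accept an already-built client. That makes it possible to pass in a client for another region or profile, or a stub in unit tests. The class name S3UtilitiesInjectable and the FakeS3Client stub are hypothetical.

```python
class S3UtilitiesInjectable:
    """Variant of S3Utilities whose client can be supplied by the caller."""

    def __init__(self, s3_client=None):
        if s3_client is None:
            # Imported lazily so the class can be exercised with a stub
            # even where boto3 is not installed
            import boto3
            s3_client = boto3.client('s3')
        self.s3_client = s3_client


class FakeS3Client:
    """Hypothetical stand-in for a real client, for illustration only."""

    def list_buckets(self):
        return {'Buckets': [{'Name': 'example-bucket'}]}


utils = S3UtilitiesInjectable(s3_client=FakeS3Client())
print(utils.s3_client.list_buckets()['Buckets'][0]['Name'])  # example-bucket
```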

List all S3 buckets

A natural first step is to list all the buckets in the AWS account. Bucket listing is account-wide, so it is not limited to the configured region. To accomplish this, we'll define a method named print_all_s3_buckets, which prints all the bucket names:

def print_all_s3_buckets(self):
    try:
        s3_resource = boto3.resource('s3')
        print("List of buckets:")
        for bucket in s3_resource.buckets.all():
            print("\t", bucket.name)
        return True
    except ClientError as e:
        print(f"Failed to list S3 buckets: {e}")
        return False

Use:

s3Utils = S3Utilities()
s3Utils.print_all_s3_buckets()
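
The same listing can also be done through a client's list_buckets call, and returning the names makes the result easier to reuse than printing it. A minimal sketch, assuming a client object is passed in; get_bucket_names and StubClient are hypothetical names, and the stub only mimics the shape of the response.

```python
def get_bucket_names(s3_client):
    """Return bucket names using the client's list_buckets call."""
    response = s3_client.list_buckets()
    return [bucket['Name'] for bucket in response.get('Buckets', [])]


class StubClient:
    """Hypothetical stub that mimics the shape of a list_buckets response."""

    def list_buckets(self):
        return {'Buckets': [{'Name': 'logs-bucket'}, {'Name': 'data-bucket'}]}


print(get_bucket_names(StubClient()))  # ['logs-bucket', 'data-bucket']
```

With a real client, the call would be get_bucket_names(boto3.client('s3')).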

Creating a Bucket

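You can create an S3 bucket using the client's create_bucket method. Optionally, specify the region in which to create the S3 bucket. Note that S3 rejects a LocationConstraint of us-east-1; for that region, omit the region argument and use a client configured for us-east-1.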

def create_bucket(self, bucket_name, region=None):
    try:
        if region is None:
            self.s3_client.create_bucket(Bucket=bucket_name)
        else:
            location = {'LocationConstraint': region}
            self.s3_client.create_bucket(Bucket=bucket_name,
                                         CreateBucketConfiguration=location)
        print(f"Bucket '{bucket_name}' created successfully.")
        return True
    except ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == 'BucketAlreadyOwnedByYou':
            print(f"Bucket '{bucket_name}' already exists.")
            return True
        else:
            print(f"Failed to create bucket '{bucket_name}': {e}")
            return False

Use:

s3Utils = S3Utilities()
s3Utils.create_bucket("my-bucket-name", "eu-west-1")
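
Because us-east-1 is the default location and cannot be sent as a LocationConstraint, it can help to build the create_bucket arguments in one place. The following is a sketch under that assumption; build_create_bucket_kwargs is a hypothetical helper, not a Boto3 function.

```python
def build_create_bucket_kwargs(bucket_name, region=None):
    """Build keyword arguments for the client's create_bucket call."""
    kwargs = {'Bucket': bucket_name}
    # us-east-1 is the default location and must not be sent as a constraint
    if region is not None and region != 'us-east-1':
        kwargs['CreateBucketConfiguration'] = {'LocationConstraint': region}
    return kwargs


print(build_create_bucket_kwargs('my-bucket-name', 'eu-west-1'))
print(build_create_bucket_kwargs('my-bucket-name', 'us-east-1'))
```

The result can be unpacked into the call, as in s3_client.create_bucket(**kwargs), using a client created for the target region.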

Set Policy

Next, we set a bucket policy. The policy is passed in as a Python dictionary and serialized to JSON before it is sent to S3:

def set_policy(self, bucket_name, bucket_policy):
    try:
        bucket_policy_json = json.dumps(bucket_policy)
        self.s3_client.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy_json)
        print(f"Policy set successfully for bucket '{bucket_name}'")
        return True
    except ClientError as e:
        print(f"Failed to set policy for bucket '{bucket_name}': {e}")
        return False

Use:

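Let's define a bucket policy and pass it, together with the bucket name, to set_policy():

bucket_name = "my-bucket-name"

# Caution: 'Principal': '*' makes the bucket public. S3 rejects such a policy
# while Block Public Access is enabled for the bucket or the account.
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPermission',
        'Effect': 'Allow',
        'Principal': '*',
        'Action': [
            's3:ListBucket',
            's3:GetObject',
            's3:PutObject',
            's3:DeleteObject'
        ],
        # s3:ListBucket applies to the bucket ARN; the object actions
        # apply to the objects under it (the /* form)
        'Resource': [
            f'arn:aws:s3:::{bucket_name}',
            f'arn:aws:s3:::{bucket_name}/*'
        ]
    }]
}
s3Utils = S3Utilities()
s3Utils.set_policy(bucket_name, bucket_policy)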
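
Bucket-level actions such as s3:ListBucket apply to the bucket ARN, while object actions such as s3:GetObject apply to the objects under it (the /* form). One way to keep that distinction visible is to generate the policy with one statement per resource type and a named principal instead of '*'. This is a sketch only; build_bucket_policy is a hypothetical helper, and the account ID in the principal ARN is a placeholder.

```python
import json


def build_bucket_policy(bucket_name, principal_arn):
    """Build a policy dict with separate bucket-level and object-level statements."""
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Sid': 'ListBucket',
                'Effect': 'Allow',
                'Principal': {'AWS': principal_arn},
                'Action': ['s3:ListBucket'],
                'Resource': f'arn:aws:s3:::{bucket_name}',
            },
            {
                'Sid': 'ReadWriteObjects',
                'Effect': 'Allow',
                'Principal': {'AWS': principal_arn},
                'Action': ['s3:GetObject', 's3:PutObject', 's3:DeleteObject'],
                'Resource': f'arn:aws:s3:::{bucket_name}/*',
            },
        ],
    }


# Placeholder account ID, for illustration only
policy = build_bucket_policy('my-bucket-name', 'arn:aws:iam::123456789012:root')
print(json.dumps(policy, indent=2))
```

The resulting dictionary can be passed to set_policy() in the same way as a hand-written one.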

Get Policy

The following method retrieves the policy for the specified bucket, if one exists. The policy is returned as a JSON string:

def get_policy(self, bucket_name):
    try:
        result = self.s3_client.get_bucket_policy(Bucket=bucket_name)
        return result['Policy']
    except ClientError as e:
        print(f"Failed to retrieve policy for bucket '{bucket_name}': {e}")
        return None

Use:

s3Utils = S3Utilities()
policy = s3Utils.get_policy(bucket_name="test-bucket-name")
print(policy)
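
The value under the 'Policy' key is a JSON string, so it needs to be parsed before it can be inspected. A minimal sketch of doing that with the standard json module; summarize_policy is a hypothetical helper and the sample string is made up for illustration.

```python
import json


def summarize_policy(policy_json):
    """Return (effect, actions) pairs from a policy JSON string."""
    policy = json.loads(policy_json)
    statements = policy.get('Statement', [])
    # A policy may hold a single statement object instead of a list
    if isinstance(statements, dict):
        statements = [statements]
    summary = []
    for statement in statements:
        actions = statement.get('Action', [])
        # 'Action' may be a single string or a list of strings
        if isinstance(actions, str):
            actions = [actions]
        summary.append((statement.get('Effect'), actions))
    return summary


# Sample string in the shape returned under the 'Policy' key
sample = ('{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", '
          '"Action": "s3:GetObject", "Resource": "arn:aws:s3:::test-bucket-name/*"}]}')
print(summarize_policy(sample))  # [('Allow', ['s3:GetObject'])]
```

Since get_policy() returns None on failure, check for that before parsing.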

Uploading a File

You can upload a file to an S3 bucket using the client's upload_file method. Specify the local file path, the bucket name, and the object name (the key under which the file is stored in the bucket).

def upload_file(self, bucket, file_name, object_name):
    try:
        self.s3_client.upload_file(file_name, bucket, object_name)
        print(f"File '{file_name}' uploaded successfully to bucket '{bucket}' as '{object_name}'")
        return True
    except FileNotFoundError:
        print(f"File '{file_name}' not found.")
        return False
    except ClientError as e:
        print(f"Failed to upload file '{file_name}' to bucket '{bucket}': {e}")
        return False

Use:

s3Utils = S3Utilities()  # create an object
s3Utils.upload_file(file_name="{path-to-the-file}/test-file.txt", bucket="test-bucket-name", object_name="my-test-files/my-test-file.txt")
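
If you would rather not spell out the object name on every call, a common convention is to fall back to the file's base name. A sketch of that idea; resolve_object_name and its prefix parameter are hypothetical and not part of the method above.

```python
import os


def resolve_object_name(file_name, object_name=None, prefix=''):
    """Pick the S3 object key: the explicit name if given, else the file's
    base name. A non-empty prefix is prepended in either case."""
    if object_name is None:
        object_name = os.path.basename(file_name)
    # S3 keys use forward slashes regardless of the local OS
    if prefix:
        object_name = f"{prefix.strip('/')}/{object_name}"
    return object_name


print(resolve_object_name('/tmp/reports/test-file.txt'))  # test-file.txt
print(resolve_object_name('/tmp/reports/test-file.txt', prefix='my-test-files'))  # my-test-files/test-file.txt
```

The resolved name can then be passed as object_name to upload_file().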

Download a File

Similarly, you can download a file from an S3 bucket to your local machine using the following method,

def download_file(self, bucket, object_name, file_name):
    try:
        self.s3_client.download_file(bucket, object_name, file_name)
        print(f"Object '{object_name}' downloaded successfully from bucket '{bucket}' to '{file_name}'")
        return True
    except FileNotFoundError:
        # For example, the local destination directory does not exist
        print(f"Local path for '{file_name}' not found.")
        return False
    except ClientError as e:
        print(f"Failed to download object '{object_name}' from bucket '{bucket}': {e}")
        return False

Use:

s3Utils = S3Utilities()
s3Utils.download_file(bucket="test-bucket-name", object_name="test-source-files/my-test-file.txt", file_name="{path-to-the-file}/test-file.txt")
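
A download can fail locally if the destination directory does not exist yet. One way to guard against that is to create the directory before calling the client's download_file. This is a sketch under that assumption; download_to_path and StubDownloadClient are hypothetical, and the stub writes fixed bytes instead of contacting S3.

```python
import os
import tempfile


def download_to_path(s3_client, bucket, object_name, file_name):
    """Create the destination directory if needed, then download."""
    directory = os.path.dirname(file_name)
    if directory:
        os.makedirs(directory, exist_ok=True)
    s3_client.download_file(bucket, object_name, file_name)
    return file_name


class StubDownloadClient:
    """Hypothetical stub: writes fixed bytes instead of contacting S3."""

    def download_file(self, bucket, key, filename):
        with open(filename, 'wb') as handle:
            handle.write(b'stub content')


target = os.path.join(tempfile.mkdtemp(), 'nested', 'dir', 'test-file.txt')
download_to_path(StubDownloadClient(), 'test-bucket-name',
                 'test-source-files/my-test-file.txt', target)
print(os.path.exists(target))  # True
```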

Complete Code

import boto3
import json
from botocore.exceptions import ClientError

class S3Utilities:
    def __init__(self):
        self.s3_client = boto3.client('s3')

    def print_all_s3_buckets(self):
        try:
            s3_resource = boto3.resource('s3')
            print("List of buckets:")
            for bucket in s3_resource.buckets.all():
                print("\t", bucket.name)
            return True
        except ClientError as e:
            print(f"Failed to list S3 buckets: {e}")
            return False
    
    def create_bucket(self, bucket_name, region=None):
        try:
            if region is None:
                self.s3_client.create_bucket(Bucket=bucket_name)
            else:
                location = {'LocationConstraint': region}
                self.s3_client.create_bucket(Bucket=bucket_name,
                                        CreateBucketConfiguration=location)
            print(f"Bucket '{bucket_name}' created successfully.")
            return True
        except ClientError as e:
            error_code = e.response['Error']['Code']
            if error_code == 'BucketAlreadyOwnedByYou':
                print(f"Bucket '{bucket_name}' already exists and is owned by you.")
                return True
            else:
                print(f"Failed to create bucket '{bucket_name}': {e}")
                return False

    def set_policy(self, bucket_name, bucket_policy):
        try:
            bucket_policy_json = json.dumps(bucket_policy)
            self.s3_client.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy_json)
            print(f"Policy set successfully for bucket '{bucket_name}'")
            return True
        except ClientError as e:
            print(f"Failed to set policy for bucket '{bucket_name}': {e}")
            return False

    def get_policy(self, bucket_name):
        try:
            result = self.s3_client.get_bucket_policy(Bucket=bucket_name)
            return result['Policy']
        except ClientError as e:
            print(f"Failed to retrieve policy for bucket '{bucket_name}': {e}")
            return None

    def delete_policy(self, bucket_name):
        try:
            self.s3_client.delete_bucket_policy(Bucket=bucket_name)
            print(f"Policy deleted successfully for bucket '{bucket_name}'")
            return True
        except ClientError as e:
            print(f"Failed to delete policy for bucket '{bucket_name}': {e}")
            return False

    def download_file(self, bucket, object_name, file_name):
        try:
            self.s3_client.download_file(bucket, object_name, file_name)
            print(f"Object '{object_name}' downloaded successfully from bucket '{bucket}' to '{file_name}'")
            return True
        except FileNotFoundError:
            # For example, the local destination directory does not exist
            print(f"Local path for '{file_name}' not found.")
            return False
        except ClientError as e:
            print(f"Failed to download object '{object_name}' from bucket '{bucket}': {e}")
            return False

    def upload_file(self, bucket, file_name, object_name):
        try:
            self.s3_client.upload_file(file_name, bucket, object_name)
            print(f"File '{file_name}' uploaded successfully to bucket '{bucket}' as '{object_name}'")
            return True
        except FileNotFoundError:
            print(f"File '{file_name}' not found.")
            return False
        except ClientError as e:
            print(f"Failed to upload file '{file_name}' to bucket '{bucket}': {e}")
            return False

That is it!

Conclusion

We've walked through performing common S3 operations in Python using the Boto3 library. With these utilities, you can streamline your interactions with S3 and manage cloud storage resources more easily. Whether you're provisioning buckets, managing policies, or transferring files, Python gives you the tools to enhance your AWS workflow.

© 2024 Solution Toolkit. All rights reserved.