Listing files from AWS S3 in Python using boto3
In this article, we will go through the boto3 documentation and list files from AWS S3. When I went through the documentation myself, I didn't find a direct solution for this. In this tutorial, we will install boto3 and the AWS CLI, configure AWS credentials, create a bucket, and then list all the files in a bucket.
Boto3
As per the documentation, Boto is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. Boto provides an easy to use, object-oriented API, as well as low-level access to AWS services.
In a nutshell, it is used to connect S3 and Python.
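To install Boto3
pip install boto3
If pip points to a Python 2 interpreter on your system, use pip3 install boto3 instead. Note that pip install boto installs the older, separate boto package, not boto3, and the code in this article uses the boto3 API.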
Install the AWS CLI
pip install awscli
If you have an AWS account, you can use it; otherwise, use LocalStack.
Let’s configure the AWS account.
aws configure
AWS Access Key ID [None]: yourAccessKeyID
AWS Secret Access Key [None]: yourAccessKey
Default region name [None]: yourRegionName (e.g. us-west-2)
Default output format [None]: json
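For anyone following along with LocalStack instead of a real account, the client has to be pointed at the local endpoint. Below is a minimal sketch of one way to do that. The helper names are hypothetical, and the endpoint http://localhost:4566 (LocalStack's usual default port), the dummy "test" credentials, and the region are assumptions for illustration; adjust them to your setup.

```python
def s3_client_kwargs(use_localstack=False, endpoint="http://localhost:4566"):
    """Build keyword arguments for boto3.client('s3', ...).

    With use_localstack=False the dict is empty, so boto3 falls back to
    whatever `aws configure` stored.
    """
    if not use_localstack:
        return {}
    return {
        "endpoint_url": endpoint,          # assumed LocalStack address
        "aws_access_key_id": "test",       # dummy value for local use only
        "aws_secret_access_key": "test",   # dummy value for local use only
        "region_name": "us-east-1",
    }


def make_s3_client(use_localstack=False):
    """Create an S3 client for AWS or for a local LocalStack instance."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("s3", **s3_client_kwargs(use_localstack))
```

Calling make_s3_client(use_localstack=True) should then give a client that talks to the local instance rather than to AWS.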
Creating a bucket in S3
aws s3 mb s3://my-bucket
Output:
make_bucket: my-bucket
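The same bucket can also be created from Python. Here is a small sketch, assuming a boto3 S3 client is passed in; the function name create_bucket and its default region are my own choices. The one detail worth knowing is that us-east-1 must not be sent as a LocationConstraint, while other regions need it.

```python
def create_bucket(client, name, region="us-east-1"):
    """Create an S3 bucket using a boto3 S3 client."""
    if region == "us-east-1":
        # us-east-1 is the default location; S3 rejects it as a constraint.
        return client.create_bucket(Bucket=name)
    return client.create_bucket(
        Bucket=name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
```

For example, create_bucket(boto3.client('s3', region_name='us-west-2'), 'my-bucket', 'us-west-2') would be the rough equivalent of the aws s3 mb command, assuming the client is created for the same region as the bucket.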
Now let’s go to the main code
Listing all the buckets in S3
import boto3
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
Output:
my-bucket
my-bucket1
my-bucket2
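For comparison, the bucket list is also available through the low-level client. This is a minimal sketch, assuming a boto3 S3 client is passed in; bucket_names is a name I made up for the helper.

```python
def bucket_names(client):
    """Return the names of all buckets visible to the given boto3 S3 client."""
    response = client.list_buckets()
    # Each entry in 'Buckets' is a dict with 'Name' and 'CreationDate'.
    return [bucket["Name"] for bucket in response.get("Buckets", [])]
```

With credentials configured, bucket_names(boto3.client('s3')) should return the same names that the resource loop prints.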
Listing all keys (in everyday terms, folders and files)
import boto3
s3 = boto3.client('s3')
bucket = 'my-bucket'
prefix = 'dir1/sub-dir1/'
# 'Contents' is missing when no key matches the prefix, so default to an empty list
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get('Contents', []):
    print(obj['Key'])
Output:
dir1/sub-dir1/s3-file.txt
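One thing to keep in mind: a single list_objects_v2 call returns at most 1,000 keys. For larger prefixes, the client's paginator can follow the continuation tokens for you. Below is a minimal sketch, assuming a boto3 S3 client is passed in; iter_keys is a hypothetical helper name.

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every key under the prefix, one page of results at a time."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

Usage would look like: for key in iter_keys(boto3.client('s3'), 'my-bucket', 'dir1/sub-dir1/'): print(key)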
Reading a file from S3
import boto3
bucket = 'my-bucket'
key = 'dir1/sub-dir1/s3-file.txt'
s3 = boto3.resource('s3')
obj = s3.Object(bucket, key)
# read() returns the whole object as bytes
body = obj.get()['Body'].read()
print(body)
Output:
b'This is a sample text file.\n'
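Notice that the output is a bytes object (the b'...' prefix). If you want a regular string, the body has to be decoded. Here is a small sketch using the client interface, assuming the file is UTF-8 text; read_text is a hypothetical helper name.

```python
def read_text(client, bucket, key, encoding="utf-8"):
    """Fetch an object with a boto3 S3 client and return its body as a str."""
    response = client.get_object(Bucket=bucket, Key=key)
    # 'Body' is a streaming object; read() returns bytes, decode() gives text.
    return response["Body"].read().decode(encoding)
```

For very large files, reading everything into memory like this may not be a good idea; downloading to disk is usually the safer route.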
Downloading a file from S3
import boto3
s3 = boto3.client('s3')
bucket = 'my-bucket'
s3_file = 'dir1/sub-dir1/s3-file.txt'
to_be_downloaded_file = 'to_be_downloaded_file.txt'
s3.download_file(bucket, s3_file, to_be_downloaded_file)
In the method above, the first parameter is the bucket to read from, the second is the key (the file name including its folder path and extension), and the third is the file name to use for the local copy.
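If you would rather keep the original file name and save it into a particular folder, a thin wrapper around download_file can help. This is only a sketch under my own assumptions: download_to_dir and its dest_dir parameter are hypothetical, and it simply uses the last part of the key as the local file name.

```python
from pathlib import Path, PurePosixPath


def download_to_dir(client, bucket, key, dest_dir="."):
    """Download an S3 object into dest_dir, named after the last part of its key."""
    # S3 keys always use '/' as the separator, whatever the local OS is.
    target = Path(dest_dir) / PurePosixPath(key).name
    # download_file does not create missing local folders, so do it here.
    target.parent.mkdir(parents=True, exist_ok=True)
    client.download_file(bucket, key, str(target))
    return target
```

For example, download_to_dir(boto3.client('s3'), 'my-bucket', 'dir1/sub-dir1/s3-file.txt', 'downloads') would save the file as downloads/s3-file.txt.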
Some of you might wonder why I have used client in some places and resource in others.
The difference is summarized below.
Client:
- Low-level AWS service access
- Generated from AWS service description
- Exposes botocore client to the developer
- Typically maps 1:1 with the AWS service API
- All AWS service operations are supported by clients
- Snake-cased method names (e.g. ListBuckets API => list_buckets method)
Resource:
- Higher-level, object-oriented API
- Generated from resource description
- Uses identifiers and attributes
- Has actions (operations on resources)
- Exposes subresources and collections of AWS resources
- Does not provide 100% API coverage of AWS services
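To make the comparison concrete, here is how the key listing from earlier might look with the resource interface instead of the client. This is a minimal sketch, assuming a boto3 S3 resource is passed in; keys_via_resource is a hypothetical helper name.

```python
def keys_via_resource(resource, bucket, prefix=""):
    """List keys under a prefix using the object-oriented resource API."""
    # Bucket(...) is an identifier-based object; .objects is a collection
    # that fetches further pages on its own while you iterate over it.
    objects = resource.Bucket(bucket).objects.filter(Prefix=prefix)
    return [obj.key for obj in objects]
```

With the client you work with plain dictionaries such as obj['Key'], while with the resource you get objects with attributes such as obj.key. Which one reads better is mostly a matter of taste.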
For detailed information, you can go through this StackOverflow article.
I have tried to keep things simple and sorted. I hope this article will help you.