Listing files from AWS S3 in Python using boto3

Shubham Kanungo
3 min read · Jun 13, 2020

In this article, we will go through the boto3 documentation and list files from AWS S3. When I was going through the documentation myself, I didn't find a direct solution for this. In this tutorial, we will install boto3 and the AWS CLI, configure AWS credentials, create a bucket, and then list all the files in a bucket.

Boto3

As per the documentation, Boto3 is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. Boto3 provides an easy-to-use, object-oriented API, as well as low-level access to AWS services.

In a nutshell, it is used to connect S3 and Python.

To install Boto3

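python2.7- pip install boto3 (recent boto3 releases no longer support Python 2.7)

python3.x- pip3 install boto3

Note that pip install boto (without the 3) installs the legacy Boto library, which is a separate package with a different API. The examples in this article need boto3.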

Install the AWS CLI

pip install awscli

If you have an AWS account, you can use it; otherwise, you can use LocalStack to emulate S3 locally.

Let’s configure the AWS account.

aws configure

AWS Access Key ID [None]: yourAccessKeyID

AWS Secret Access Key [None]: yourSecretAccessKey

Default region name [None]: yourRegionName (e.g. us-west-2)

Default output format [None]: json
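If you are using LocalStack instead of a real account, the client has to be pointed at the local endpoint. Below is a minimal sketch of one way to do that. The helper name s3_client_kwargs and the S3_ENDPOINT_URL variable are made up for this example, and the placeholder "test" credentials assume the emulator does not validate them.

```python
import os


def s3_client_kwargs(env=None):
    """Build keyword arguments for boto3.client('s3', **kwargs)."""
    env = os.environ if env is None else env
    # Fall back to the region used in the configure example.
    kwargs = {"region_name": env.get("AWS_DEFAULT_REGION", "us-west-2")}
    endpoint = env.get("S3_ENDPOINT_URL")  # hypothetical variable name
    if endpoint:
        # Point the client at a local emulator instead of real AWS.
        kwargs["endpoint_url"] = endpoint
        # Assumption: the emulator accepts placeholder credentials.
        kwargs["aws_access_key_id"] = "test"
        kwargs["aws_secret_access_key"] = "test"
    return kwargs
```

With boto3 installed, you would then create the client with s3 = boto3.client('s3', **s3_client_kwargs()). For LocalStack, the endpoint is usually something like http://localhost:4566, but check the port against your own setup.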

Creating a bucket in S3

aws s3 mb s3://my-bucket
output-
make_bucket: my-bucket
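The same bucket can also be created from Python. This is a rough sketch, not the only way to do it: the function name create_bucket is my own, and s3_client is assumed to be a boto3 S3 client such as boto3.client('s3'). The special case reflects that us-east-1 is the default location and is not passed as a LocationConstraint.

```python
def create_bucket(s3_client, name, region):
    """Create a bucket with a boto3 S3 client and return its name."""
    if region == "us-east-1":
        # us-east-1 is the default location, so no constraint is sent.
        s3_client.create_bucket(Bucket=name)
    else:
        s3_client.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    return name
```

For example, create_bucket(boto3.client('s3'), 'my-bucket', 'us-west-2') should be roughly equivalent to the aws s3 mb command above when the CLI is configured for us-west-2.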

Now let’s go to the main code

Listing all the buckets in S3

import boto3
s3 = boto3.resource('s3')
# buckets.all() iterates over every bucket owned by the account
for bucket in s3.buckets.all():
    print(bucket.name)
output-
my-bucket
my-bucket1
my-bucket2

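Listing all keys (in everyday terms, folders and files)

import boto3
s3 = boto3.client('s3')
bucket = 'my-bucket'
prefix = 'dir1/sub-dir1/'
# One list_objects_v2 call returns at most 1000 keys; 'Contents' is absent when nothing matches
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get('Contents', []):
    print(obj['Key'])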
output-
dir1/sub-dir1/s3-file.txt
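A single list_objects_v2 call only returns one page of results, so a bucket with many files needs pagination. Here is a small sketch using the client's paginator. The function name iter_keys is just for illustration, and s3_client is assumed to be a boto3 S3 client.

```python
def iter_keys(s3_client, bucket, prefix=""):
    """Yield every object key under prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is missing from a page when no keys match.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

You could then print everything under a folder with a loop like for key in iter_keys(boto3.client('s3'), 'my-bucket', 'dir1/'): print(key).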

Reading a file from S3

import boto3
bucket = 'my-bucket'
key = 'dir1/sub-dir1/s3-file.txt'
s3 = boto3.resource('s3')
obj = s3.Object(bucket, key)
# read() returns the whole object as bytes
body = obj.get()['Body'].read()
print(body)
output-
b'This is a sample text file.\n'
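Notice that the output is a bytes object (the b'...' prefix). If you want a normal string, you have to decode it. The sketch below does this with the client API. The name read_text is my own, and it assumes the file is small enough to fit in memory and is encoded as UTF-8 unless you say otherwise.

```python
def read_text(s3_client, bucket, key, encoding="utf-8"):
    """Fetch an object with a boto3 S3 client and return it as a string."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # 'Body' is a stream; read() gives bytes, which we decode to text.
    return response["Body"].read().decode(encoding)
```

For example, read_text(boto3.client('s3'), 'my-bucket', 'dir1/sub-dir1/s3-file.txt') would return the sample text as a str instead of bytes.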

Downloading a file from S3

import boto3
s3 = boto3.client('s3')
bucket = 'my-bucket'
s3_file = 'dir1/sub-dir1/s3-file.txt'
to_be_downloaded_file = 'to_be_downloaded_file.txt'
# Arguments: bucket name, object key, local file name
s3.download_file(bucket, s3_file, to_be_downloaded_file)

In the above method, the first parameter is the bucket you want to read from, the second is the object key (the file name including its folder path and extension), and the third is the file name to save the download as on the local system.
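Listing and downloading can be combined to fetch a whole "folder". The following is a sketch under a few assumptions: download_prefix is a name I made up, s3_client is a boto3 S3 client, keys ending in "/" are treated as empty folder placeholders, and the keys are trusted (no path sanitising is done).

```python
import os


def download_prefix(s3_client, bucket, prefix, dest_dir):
    """Download every object under prefix into dest_dir; return local paths."""
    saved = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # folder placeholder, nothing to download
            # Path relative to the prefix; fall back to the bare file name.
            relative = key[len(prefix):].lstrip("/") or key.rsplit("/", 1)[-1]
            local_path = os.path.join(dest_dir, *relative.split("/"))
            os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            saved.append(local_path)
    return saved
```

Calling download_prefix(boto3.client('s3'), 'my-bucket', 'dir1/sub-dir1/', 'downloads') would recreate the sub-folder structure under a local downloads directory.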

Some of you might wonder why I have used client in some places and resource in others.

The difference is summarized below.

Client:

  • Low-level AWS service access
  • Generated from AWS service description
  • Exposes botocore client to the developer
  • Typically maps 1:1 with the AWS service API
  • All AWS service operations are supported by clients
  • Snake-cased method names (e.g. ListBuckets API => list_buckets method)

Resource:

  • Higher-level, object-oriented API
  • Generated from resource description
  • Uses identifiers and attributes
  • Has actions (operations on resources)
  • Exposes subresources and collections of AWS resources
  • Does not provide 100% API coverage of AWS services
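To make the comparison concrete, here is the same task (getting bucket names) written in both styles. This is only an illustrative sketch: the two function names are my own, s3_client is assumed to come from boto3.client('s3') and s3_resource from boto3.resource('s3').

```python
def bucket_names_via_client(s3_client):
    """Client style: call the ListBuckets operation and unpack the dict."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


def bucket_names_via_resource(s3_resource):
    """Resource style: iterate over a collection of Bucket objects."""
    return [bucket.name for bucket in s3_resource.buckets.all()]
```

The client version works with plain dictionaries that mirror the API response, while the resource version works with objects and attributes.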

For detailed information, you can go through the Stack Overflow discussion on the difference between the boto3 client and resource.

I have tried to keep things simple and sorted. I hope this article will help you.
