Python Lambda function to convert S3 bucket images into PDFs

Created: Thu 31 Aug 2023 Updated: 10 months, 1 week ago

AWS Lambda is amazing. It's a serverless platform where you don't have to manage the underlying compute resources manually; instead, they scale automatically as the code requires. With AWS Lambda, you can stay focused on the code itself instead of worrying about how or where to deploy it.

In this article, I'll show how you can create your own Lambda function, assign an S3 bucket upload trigger, convert the uploaded image into a PDF file, and remove the image file. You can also upload multiple images to the bucket at the same time. Each image triggers the Lambda function separately and is converted into its own PDF.

Python code to convert image into pdf

I've written a Python Lambda function for this article. If you are just starting out with Lambda, I advise you to use my code so that you can focus on understanding how Lambda works instead of trying to understand the code itself. You are welcome to use your own code, as long as your S3 bucket and directory names are aligned with it. Clone the code from my GitHub repository.
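For orientation, a function of this kind might look roughly like the sketch below. It is an illustration only, not the code from the repository: the folder names images/ and pdfs/, the helper names uploaded_images, pdf_key_for and image_to_pdf, and the choice of PyMuPDF's convert_to_pdf for the conversion are all assumptions made for this example.

```python
import os
from urllib.parse import unquote_plus

IMAGE_PREFIX = "images/"  # assumed upload folder inside the bucket
PDF_PREFIX = "pdfs/"      # assumed output folder inside the bucket


def uploaded_images(event):
    """Yield (bucket, key) for every image upload described in an S3 event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        # Skip folder placeholders and anything outside the image folder.
        if key.startswith(IMAGE_PREFIX) and not key.endswith("/"):
            yield bucket, key


def pdf_key_for(image_key):
    """Map 'images/photo.png' to 'pdfs/photo.pdf'."""
    stem = os.path.splitext(os.path.basename(image_key))[0]
    return f"{PDF_PREFIX}{stem}.pdf"


def image_to_pdf(image_bytes, extension):
    """Convert raw image bytes into PDF bytes using PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily so the helpers above work without it

    image_doc = fitz.open(stream=image_bytes, filetype=extension)
    try:
        return image_doc.convert_to_pdf()
    finally:
        image_doc.close()


def lambda_handler(event, context):
    """Convert each uploaded image to a PDF, then delete the image."""
    import boto3  # shipped inside package.zip in this article's setup

    s3 = boto3.client("s3")
    converted = []
    for bucket, key in uploaded_images(event):
        extension = os.path.splitext(key)[1].lstrip(".").lower()
        image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        pdf_key = pdf_key_for(key)
        s3.put_object(
            Bucket=bucket,
            Key=pdf_key,
            Body=image_to_pdf(image_bytes, extension),
            ContentType="application/pdf",
        )
        s3.delete_object(Bucket=bucket, Key=key)
        converted.append(pdf_key)
    return {"converted": converted}
```

Writing the PDFs to a different folder than the one that fires the trigger matters here: if the output landed in the watched folder, each new PDF would invoke the function again.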

Great, you have the code. Now create a new directory on your system and place your lambda_function.py file inside it. Next, we need to install the Python packages that the code requires. Run the commands below inside this newly created directory (make sure pip is installed on the system; if it isn't, run sudo apt update && sudo apt install python3-pip first):

$ sudo apt install python3-testresources

$ pip install boto3 pymupdf -t .

This installs the Python packages into the current directory. Shipping the packages alongside the lambda function file is the easiest way to set up a Lambda function. Your directory structure should now look like this:

image-defining-directory-tree

After verifying the directory tree, we need to create a zip file with all these packages and the lambda function file.

I'm assuming you know how to create a zip file if you are on Windows. Just keep in mind that all these packages and files must sit at the root of the zip archive, not inside a subfolder. To create the zip file on Linux, run the command below from inside the directory:

$ zip -r ~/package.zip .

This command creates a package.zip file in your home directory. We'll use this zip file later in Lambda.
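If you would rather build the archive from Python (for example on Windows), the same result can be sketched with the standard zipfile module. The function names build_package and has_handler_at_root are made up for this example; the important part is that every entry is stored relative to the source directory, so lambda_function.py ends up at the root of the archive.

```python
import os
import zipfile


def build_package(source_dir, zip_path):
    """Zip everything under source_dir so entries sit at the archive root."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_abs, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # never add the archive to itself
                # Store the path relative to source_dir, not the absolute path.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_abs


def has_handler_at_root(zip_path, handler_file="lambda_function.py"):
    """Check that the handler file is at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return handler_file in archive.namelist()
```

Calling build_package(".", os.path.expanduser("~/package.zip")) from inside the project directory mirrors the zip command above, and has_handler_at_root gives a quick sanity check before uploading.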

Creating S3 bucket

Log in to your AWS account and open S3 from services. We will create a new bucket that will hold both the uploaded images and the output PDFs.

  • Create a new bucket named sys-tests.
  • Keep the bucket settings at their defaults.
  • If you are using your own lambda function code, choose the bucket name accordingly.
  • Select any region you want; just keep in mind that we will use the same region to create the Lambda function.
  • Create two new folders, images and pdfs, inside the sys-tests bucket.
  • Upload the package.zip file to the root of the sys-tests bucket.
  • After uploading the zip file, click on it and copy its Object URL. This is the direct URL of the file, which looks like this: https://sys-tests.s3.us-west-2.amazonaws.com/package.zip

After following the above points, your AWS S3 bucket should look like this:

s3-bucket-tree-view
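The same bucket setup can also be scripted with boto3 instead of clicking through the console. The sketch below is one possible way to do it, with assumptions: prepare_bucket and object_url are names invented for this example, the S3 client is passed in by the caller, and the folders are created as empty objects whose keys end in a slash (which is how the console represents folders). Keep in mind that bucket names are globally unique, so sys-tests may already be taken.

```python
def object_url(bucket, region, key):
    """Build the virtual-hosted style Object URL for a key."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def prepare_bucket(s3, bucket, region, zip_path):
    """Create the bucket, the two folders, and upload package.zip.

    `s3` is expected to be a boto3 S3 client created for `region`.
    """
    if region == "us-east-1":
        # us-east-1 is the default and must not be given as a constraint.
        s3.create_bucket(Bucket=bucket)
    else:
        s3.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    # Empty objects with a trailing slash show up as folders in the console.
    for folder in ("images/", "pdfs/"):
        s3.put_object(Bucket=bucket, Key=folder)
    s3.upload_file(zip_path, bucket, "package.zip")
    return object_url(bucket, region, "package.zip")


# Example call (needs boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("s3", region_name="us-west-2")
#   print(prepare_bucket(client, "sys-tests", "us-west-2", "package.zip"))
```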

If your S3 bucket contents are the same as mine, then congratulations, you have completed the S3 part. Let's move on to the AWS Lambda part.

Creating Lambda function

Before starting, make sure you are in the same region as your S3 bucket. Open Lambda from services and click on Create function to create a new Lambda function.

  • Select the Author from scratch option.
  • Function name is the name of our Lambda function. Enter sys-test in this field.
  • The Runtime field specifies the programming language of the Lambda function. Select Python 3.8 from the options (if it is no longer offered, choose a newer Python 3 runtime).
  • Keep everything else at its default and click on Create function.

Now open your newly created Lambda function and click on Upload from, as shown in the image below:

upload-code-from-s3-object-url

Select Amazon S3 location from the options and a pop-up will open. Paste the Object URL of the package.zip file that we copied in the previous step and click on "save". Your Lambda function code has now been uploaded from the S3 bucket along with all the required packages.
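The console step above has a scripted counterpart in boto3's update_function_code call, which takes the bucket and key rather than the full URL. Here is a minimal sketch, assuming the Object URL is in the virtual-hosted style shown earlier; parse_object_url and deploy_from_object_url are hypothetical helper names, and the Lambda client is passed in by the caller.

```python
from urllib.parse import unquote, urlparse


def parse_object_url(url):
    """Split a virtual-hosted style Object URL into (bucket, key)."""
    parsed = urlparse(url)
    # The host looks like '<bucket>.s3.<region>.amazonaws.com'.
    bucket = parsed.netloc.split(".s3.")[0]
    key = unquote(parsed.path.lstrip("/"))
    return bucket, key


def deploy_from_object_url(lambda_client, function_name, url):
    """Point the function's code at the zip file stored in S3.

    `lambda_client` is expected to be a boto3 Lambda client in the
    same region as the bucket.
    """
    bucket, key = parse_object_url(url)
    return lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
    )


# Example call (needs boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("lambda", region_name="us-west-2")
#   deploy_from_object_url(
#       client, "sys-test",
#       "https://sys-tests.s3.us-west-2.amazonaws.com/package.zip")
```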

Configuring Lambda trigger on S3

Being invoked by an event trigger is the main feature of Lambda. After opening the sys-test Lambda function, click on Configuration and select Triggers from the left-side menu. The Triggers panel will open, and you can see that we currently don't have any triggers configured, which means the Lambda function cannot be invoked automatically by an event. Click on Add trigger to configure a new Lambda trigger:

  • The trigger configuration requires a source. Open the list of available sources and select S3 to configure an S3 trigger.
  • Next, choose the bucket to attach the trigger to. In our case, select the sys-tests bucket from the list.
  • Event types is the main option here. Select All object create events from the list so that our Lambda function is triggered whenever an object is created in the bucket, whichever way it is uploaded.
  • In Prefix, enter the folder name that contains the images. In our case, it's images.

After entering all the required details, click on the "Add" button to create the trigger. Your trigger section should now look like the image below:

s3-bucket-view

Now that you have configured the S3 event-driven Lambda trigger, your function will be invoked whenever a file is uploaded to the images/ directory of the sys-tests S3 bucket.
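Behind the console form, the trigger is a notification configuration stored on the bucket. The sketch below shows roughly what that configuration looks like when set through boto3. The helper names and the account ID in the example ARN are placeholders, and images/ is used as the prefix here as an assumption. Two caveats if you go this route: the call replaces the bucket's whole notification configuration, and S3 also needs permission to invoke the function, which the console adds for you but a script would have to grant separately.

```python
def s3_trigger_config(function_arn, prefix="images/"):
    """Build a notification configuration for 'All object create events'."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Covers every way of creating an object (put, post, copy, ...).
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def attach_trigger(s3, bucket, function_arn, prefix="images/"):
    """Store the configuration on the bucket; `s3` is a boto3 S3 client."""
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=s3_trigger_config(function_arn, prefix),
    )


# Example call (placeholder account ID, needs boto3 and AWS credentials):
#   import boto3
#   attach_trigger(
#       boto3.client("s3"), "sys-tests",
#       "arn:aws:lambda:us-west-2:123456789012:function:sys-test")
```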

Managing IAM role and permission policies for Lambda

Everything is now set up according to plan, but if you try to test the project, you'll get a permission error. That's because a default role is created and attached to the function when the Lambda function is created. This default role does not have any read or write permission on the S3 bucket, which is why the Lambda function cannot read or write files in the bucket.

In Configuration, select the Permissions option from the left-side menu, right under the Triggers option. Then:

  • You'll see the Execution role, which grants the function permission to access AWS services and resources.
  • Click on the only role that is present, which is the default role and looks something like this: sys-test-role-9tkovuhy. It'll open the role in a new tab.
  • The Identity and Access Management page will open, which will look like this: IAM-permission-policy

    Click on Add permissions and select Attach policies to grant this role access to services and resources.

  • Search the list and attach these two policies to the current role: AmazonS3FullAccess and AWSLambda_ReadOnlyAccess. These policies give the role the permissions it needs to perform the required operations on Lambda and S3.

Now that you have assigned the required permissions to the role, it's time to test it.
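The same two managed policies can be attached from a script with boto3's attach_role_policy. The sketch below assumes the IAM client is passed in and uses the role name from the example above; attach_policies and scoped_s3_policy are names made up for this illustration. Since AmazonS3FullAccess covers every bucket in the account, the second helper shows, purely as an optional alternative and not what this article uses, what a policy limited to one bucket's objects could look like.

```python
MANAGED_POLICY_ARNS = (
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AWSLambda_ReadOnlyAccess",
)


def attach_policies(iam, role_name, policy_arns=MANAGED_POLICY_ARNS):
    """Attach each managed policy to the role; `iam` is a boto3 IAM client."""
    for arn in policy_arns:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return list(policy_arns)


def scoped_s3_policy(bucket):
    """Optional alternative: allow only object reads, writes and deletes
    inside a single bucket instead of full S3 access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Example call (needs boto3 and AWS credentials; the role name is the
# auto-generated one shown on your own Permissions page):
#   import boto3
#   attach_policies(boto3.client("iam"), "sys-test-role-9tkovuhy")
```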

Test

Go to the images folder of the sys-tests S3 bucket and upload an image file. If everything worked as planned, you will notice that the image you uploaded is automatically deleted from the images folder and the converted PDF appears in the pdfs folder.

You can download the PDF and check that it's a healthy file, converted by the Python Lambda function that was triggered by the S3 upload. You can also upload multiple images, and the Lambda function will convert each of them into a PDF file.
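If you prefer to check the result from a script rather than refreshing the console, a small polling loop does the job. This is only a sketch: pdf_exists and wait_for_pdf are hypothetical helpers, the S3 client is passed in, and the expected PDF key (for example pdfs/photo.pdf) is an assumption about how the function names its output.

```python
import time


def pdf_exists(s3, bucket, pdf_key):
    """Return True if an object with exactly this key is in the bucket."""
    response = s3.list_objects_v2(Bucket=bucket, Prefix=pdf_key)
    return any(item["Key"] == pdf_key for item in response.get("Contents", []))


def wait_for_pdf(s3, bucket, pdf_key, attempts=10, delay=2.0):
    """Poll the bucket until the PDF shows up or the attempts run out."""
    for _ in range(attempts):
        if pdf_exists(s3, bucket, pdf_key):
            return True
        time.sleep(delay)
    return False


# Example call (needs boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("s3")
#   client.upload_file("photo.png", "sys-tests", "images/photo.png")
#   print(wait_for_pdf(client, "sys-tests", "pdfs/photo.pdf"))
```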

Conclusion

If you are new to Lambda, I hope you have learned a lot from this article. It focuses on my Python script, but if you decide to use your own script, the same method applies with a few small changes to the process. This is a basic demonstration of how a Lambda function works. Lambda is a feature-rich service, and you should definitely learn more about it if you find it interesting. Check out the AWS Lambda Docs to know more.


Author: Harpreet Singh
Server Administrator
