How to Merge PDF Documents with Python, Tutorial

Why Use Merge PDFs with Python?

The pdfRest Merge PDFs API Tool combines multiple PDF documents into a single PDF file. This tutorial walks through making an API call to the Merge PDFs endpoint using Python.

Merging PDFs can be useful in various real-world scenarios, such as combining scanned documents, consolidating reports, or assembling a portfolio of work.

Merge PDFs with Python Code Example

from requests_toolbelt import MultipartEncoder
import requests
import json

merged_pdf_endpoint_url = 'https://api.pdfrest.com/merged-pdf'

# The /merged-pdf endpoint can take one or more PDF files or ids as input.
# This sample takes 2 PDF files and merges all the pages of both into a single document.

merge_request_data = []

# Array of tuples that contains information about the 2 files that will be merged
files = [
    ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
    ('file_name2.pdf', open('/path/to/file', 'rb'), 'application/pdf')
]

# Structure the data that will be sent to POST merge request as an array of tuples
for i in range(len(files)):
    merge_request_data.append(("file", files[i]))
    merge_request_data.append(("pages", "1-last"))
    merge_request_data.append(("type", "file"))

merge_request_data.append(('output', 'example_mergedPdf_out'))

mp_encoder_mergedPdf = MultipartEncoder(
    fields=merge_request_data
)

# Let's set the headers that the merged-pdf endpoint expects.
# Since MultipartEncoder is used, the 'Content-Type' header is taken from its content_type attribute below, which is 'multipart/form-data' plus the generated boundary.
headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_mergedPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' # place your api key here
}

print("Sending POST request to merged-pdf endpoint...")
response = requests.post(merged_pdf_endpoint_url, data=mp_encoder_mergedPdf, headers=headers)

print("Response status code: " + str(response.status_code))

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent = 2))
else:
    print(response.text)

# If you would like to download the file instead of getting the JSON response, please see the 'get-resource-id-endpoint.py' sample.

The source of the provided code is available on GitHub at pdf-rest-api-samples.

Breaking Down the Code

The provided code performs the following steps:

from requests_toolbelt import MultipartEncoder
import requests
import json

This imports the necessary libraries: requests_toolbelt for creating a multipart encoder, requests for making HTTP requests, and json for JSON processing.

merged_pdf_endpoint_url = 'https://api.pdfrest.com/merged-pdf'

Defines the API endpoint URL for merging PDFs.

merge_request_data = []

Initializes an empty list to store the request data for the merge operation.

files = [
    ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
    ('file_name2.pdf', open('/path/to/file', 'rb'), 'application/pdf')
]

Creates a list of tuples representing the files to be merged. Each tuple contains the file name, a file object opened in binary read mode, and the MIME type. Replace each '/path/to/file' placeholder with the path to the corresponding PDF.
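The sample opens both files and never closes them. The following is a minimal sketch of one way to build the same tuples while making sure the handles are closed, using the standard library's ExitStack. The helper name open_pdf_parts is hypothetical and is not part of the pdfRest sample.

```python
from contextlib import ExitStack
from pathlib import Path


def open_pdf_parts(stack, paths):
    """Open each path in binary mode and return (name, handle, mime) tuples.

    Every handle is registered on `stack`, so it is closed when the
    enclosing `with ExitStack()` block exits.
    """
    parts = []
    for path in paths:
        handle = stack.enter_context(open(path, "rb"))
        parts.append((Path(path).name, handle, "application/pdf"))
    return parts


# Intended usage (paths are placeholders):
# with ExitStack() as stack:
#     files = open_pdf_parts(stack, ["/path/to/first.pdf", "/path/to/second.pdf"])
#     ... build the request data and send it inside this block ...
```

The request has to be sent inside the with block, because the encoder reads from the file handles while the request body is being uploaded.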

for i in range(len(files)):
    merge_request_data.append(("file", files[i]))
    merge_request_data.append(("pages", "1-last"))
    merge_request_data.append(("type", "file"))

Iterates over the files and, for each one, appends three fields to the request data list in order: the file tuple, the page range ("1-last" selects every page), and the input type ("file").
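The same loop can be wrapped in a small function so that the field order is easy to inspect before anything is sent. This is a sketch only: build_merge_fields is a hypothetical helper, and it assumes every input uses the same page range, as in the sample.

```python
def build_merge_fields(files, output_name, pages="1-last"):
    """Return the ordered multipart fields for a merge request.

    Each input contributes a ("file", ...), ("pages", ...), ("type", ...)
    triple, mirroring the loop in the sample; "output" is appended once.
    """
    fields = []
    for file_tuple in files:
        fields.append(("file", file_tuple))
        fields.append(("pages", pages))
        fields.append(("type", "file"))
    fields.append(("output", output_name))
    return fields
```

The returned list can be passed directly as the fields argument of MultipartEncoder, in place of merge_request_data.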

merge_request_data.append(('output', 'example_mergedPdf_out'))

Appends the desired output file name to the request data.

mp_encoder_mergedPdf = MultipartEncoder(
    fields=merge_request_data
)

Creates a MultipartEncoder object from the request data. The encoder serializes the fields and files into a multipart/form-data request body.

headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_mergedPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}

Sets the HTTP headers: Accept requests a JSON response, Content-Type comes from the encoder, and Api-Key carries your API key. Replace the placeholder value with your actual key.
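To avoid hard-coding the key in the script, the headers can be built from an environment variable instead. The sketch below is one possible approach, not part of the pdfRest sample; the function name build_headers and the variable name PDFREST_API_KEY are assumptions chosen for illustration.

```python
import os


def build_headers(content_type, env_var="PDFREST_API_KEY"):
    """Build the request headers, reading the API key from the environment.

    PDFREST_API_KEY is a hypothetical variable name used only in this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Accept": "application/json",
        "Content-Type": content_type,
        "Api-Key": api_key,
    }
```

With this helper, the headers in the sample would become build_headers(mp_encoder_mergedPdf.content_type).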

response = requests.post(merged_pdf_endpoint_url, data=mp_encoder_mergedPdf, headers=headers)

Makes a POST request to the API endpoint with the encoded data and headers.

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent = 2))
else:
    print(response.text)

Checks response.ok, which is True for status codes below 400. On success the JSON response is pretty-printed; otherwise the raw response body is printed as the error message.
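A successful status does not guarantee that the body parses as JSON, and parsing fails with an exception when it does not. The following sketch shows a slightly more defensive variant of the same check that falls back to the raw text. The helper name format_response is hypothetical, and it works on plain values so it can be tried without calling the API.

```python
import json


def format_response(ok, body_text):
    """Return a printable string for a response body.

    Pretty-prints the body when the request succeeded and the body is
    valid JSON; otherwise returns the raw text unchanged.
    """
    if ok:
        try:
            return json.dumps(json.loads(body_text), indent=2)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            return body_text
    return body_text
```

In the sample, the if/else block would then reduce to print(format_response(response.ok, response.text)).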

Beyond the Tutorial

This tutorial demonstrated how to use Python to call the pdfRest Merge PDFs API and merge multiple PDF documents into a single file. You can explore and demo all of the pdfRest API Tools in the API Lab, and refer to the API Reference documentation for more details.

Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at pdf-rest-api-samples.
