How to Translate PDF Text with Python
Why Translate PDF Text with Python?
The pdfRest Translate PDF API Tool lets developers translate the text of PDF documents into other languages. This tutorial walks through sending an API call to the Translate PDF endpoint using Python, so that PDF text translation can be automated.
Imagine you are working in a multinational company that frequently deals with documents in various languages. Using the Translate PDF API, you can quickly translate reports, contracts, or any PDF documents into the desired language, facilitating better communication and understanding across different regions.
Translate PDF Text with Python Code Example
from requests_toolbelt import MultipartEncoder
import requests
import json

# By default, we use the US-based API service. This is the primary endpoint for global use.
api_url = "https://api.pdfrest.com"

# For GDPR compliance and enhanced performance for European users, you can switch to the EU-based service by uncommenting the URL below.
# For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
#api_url = "https://eu-api.pdfrest.com"

endpoint_url = api_url+'/translated-pdf-text'

# The endpoint can take a single PDF file or id as input.
mp_encoder = MultipartEncoder(
    fields={
        'file': ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
        # Translates text to American English. Format the output_language as a 2-3 character ISO 639 code, optionally with a region/script (e.g., 'en', 'es', 'zh-Hant', 'eng-US').
        'output_language': 'en-US',
    }
)

headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}

print("Sending POST request to translated-pdf-text endpoint...")
response = requests.post(endpoint_url, data=mp_encoder, headers=headers)

print("Response status code: " + str(response.status_code))

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent=2))
else:
    print(response.text)
Source: GitHub
Breaking Down the Code
The code begins by importing the necessary libraries: requests_toolbelt for handling multipart encoding, requests for making HTTP requests, and json for formatting the JSON response before printing it.
api_url = "https://api.pdfrest.com"
This sets the base URL for the API. By default, it uses the US-based service. For European users, the EU-based service can be used by uncommenting the alternative URL, ensuring GDPR compliance and potentially better performance.
endpoint_url = api_url+'/translated-pdf-text'
The endpoint URL is constructed by appending /translated-pdf-text to the base API URL. This is the specific endpoint for translating PDF text.
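The region choice and URL construction described above can be sketched as a small helper instead of commenting and uncommenting a line. The function name build_endpoint_url and the use_eu flag are illustrative assumptions, not part of the pdfRest sample; only the two base URLs and the endpoint path come from the code above.

```python
# Base URLs taken from the tutorial's sample code.
US_API_URL = "https://api.pdfrest.com"
EU_API_URL = "https://eu-api.pdfrest.com"


def build_endpoint_url(use_eu: bool = False, path: str = "/translated-pdf-text") -> str:
    """Return the full endpoint URL for the chosen region.

    Hypothetical helper: it only joins a base URL and a path, making sure
    there is exactly one slash between them.
    """
    base = EU_API_URL if use_eu else US_API_URL
    return base.rstrip("/") + "/" + path.lstrip("/")


print(build_endpoint_url())
print(build_endpoint_url(use_eu=True))
```

Keeping the region behind a flag makes it possible to switch services from configuration rather than by editing the script.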
mp_encoder = MultipartEncoder(
    fields={
        'file': ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
        'output_language': 'en-US',
    }
)
The MultipartEncoder is used to create a multipart form-data payload. The fields dictionary includes:
file: The PDF file to be translated, opened in binary read mode.
output_language: Specifies the target language for translation using a 2-3 character ISO 639 code, optionally with a region/script (e.g., 'en', 'es', 'zh-Hant').
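Because a malformed output_language value would only be discovered after the upload, it can help to check its shape locally first. The following is a minimal sketch, assuming the format is "2-3 letters, optionally followed by a 4-letter script and/or a 2-letter or 3-digit region". The helper looks_like_language_code and its regular expression are hypothetical; they check only the shape of the string, not whether pdfRest supports that language.

```python
import re

# Assumed shape: language (2-3 letters), optional script (4 letters),
# optional region (2 letters or 3 digits), separated by hyphens.
_LANGUAGE_CODE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?")


def looks_like_language_code(value: str) -> bool:
    """Return True if value has the general shape of a language code."""
    return _LANGUAGE_CODE.fullmatch(value) is not None


for code in ("en", "es", "zh-Hant", "en-US", "english"):
    print(code, looks_like_language_code(code))
```

A check like this catches typos such as 'en_US' or 'english' before any request is sent; the API itself remains the authority on which codes are accepted.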
headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}
The headers dictionary specifies the request headers, including the Api-Key for authentication, which should be replaced with your actual API key.
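Rather than pasting a real key into the script, one common approach is to read it from an environment variable. This is a sketch under stated assumptions: the variable name PDFREST_API_KEY and the helpers load_api_key and build_headers are made up for illustration and are not defined by pdfRest. The header names match the sample above.

```python
import os


def load_api_key(env_var: str = "PDFREST_API_KEY") -> str:
    """Read the API key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


def build_headers(api_key: str, content_type: str) -> dict:
    """Build the same three headers used in the tutorial's sample."""
    return {
        "Accept": "application/json",
        "Content-Type": content_type,
        "Api-Key": api_key,
    }
```

In the tutorial's script this could be used as headers = build_headers(load_api_key(), mp_encoder.content_type), which keeps the key out of source control.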
response = requests.post(endpoint_url, data=mp_encoder, headers=headers)
This line sends a POST request to the endpoint with the encoded data and headers. The response is stored in the response variable.
If the request is successful, the response JSON is printed in an indented format. Otherwise, the error message is displayed.
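The success and failure branches can also be pulled into a small function, which makes the handling easier to reuse and to test without calling the API. This is a sketch only: summarize_response is a hypothetical helper, and the status check mirrors how requests defines response.ok (any status code below 400).

```python
import json


def summarize_response(status_code: int, body: str) -> str:
    """Return pretty-printed JSON on success, or an error message otherwise."""
    if status_code < 400:
        try:
            return json.dumps(json.loads(body), indent=2)
        except json.JSONDecodeError:
            # Successful status but a non-JSON body: return it unchanged.
            return body
    return f"Request failed with status {status_code}: {body}"


print(summarize_response(200, '{"a": 1}'))
print(summarize_response(401, "Unauthorized"))
```

With the tutorial's script, this would be called as print(summarize_response(response.status_code, response.text)).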
Beyond the Tutorial
In this tutorial, you learned how to send a request to the Translate PDF API using Python. This is just one of the many tools available through pdfRest. To explore more, try out all of the pdfRest API Tools in the API Lab. For further details, refer to the API Reference Guide.
Note: This example demonstrates a multipart API call. For code samples using JSON payloads, visit GitHub.