Translate PDF API Tool

Translate PDF

Pro

Translate PDF is an AI-powered REST API Tool that uses OpenAI models to translate the text of any PDF, Markdown, or Plain Text file. Designed to maintain accuracy and context across languages, the tool extracts the content and translates it into the target language you specify.

Key Benefits of Translate PDF API

  • Instantly convert technical documents, legal contracts, and reports into virtually any language supported by OpenAI's advanced models, enabling global accessibility.
  • Automatically detect and translate the entire text content of a PDF, ensuring minimal loss of context and preserving the document's original structure.
  • Restrict translation to a specific page or range of pages for efficient processing of very large or multi-section documents.
  • Easily process PDF, Markdown, or Plain Text files and receive the translated content as structured Markdown or raw Plain Text output.
  • Choose to receive the translated content directly within the JSON response or as a separate downloadable file for flexible, high-volume application workflows.
What are Pro Tools?
Pro Tools are a suite of advanced and specialized API tools designed to tackle more complex document processing challenges. They are included with Pro and Enterprise plans. Premium plan users can also access Pro Tools on a per-call basis when needed.
Build Your Solution

You have document processing problems; we have solutions. Explore the many ways pdfRest can align your documents with your business objectives.

Browse all solutions
Integrate pdfRest with Microsoft Power Automate
Streamline Global E-Discovery by Translating Foreign Legal Documents
Ensure GDPR Compliance for PDF Processing with EU-Based Cloud API
Integrate PDF API Tools with Salesforce Apex Code
Automate Multi-Language Knowledge Base Creation from Product Manuals
Why is pdfRest the best API to translate PDF text?
pdfRest is built for PDF text translation: it delivers fast, contextual language conversion, extracts content reliably from complex documents, and integrates cleanly into global workflows.

Deliver Contextual Language Conversion with OpenAI

The Translate PDF API Tool provides reliable and context-aware language conversion by leveraging powerful OpenAI models. Unlike simple text-in, text-out translation services, our tool is built specifically for documents, ensuring the translation is accurate and preserves the intent and context of the original text. You gain total control over the language conversion process:

  • Broad Language Support: Translate text between virtually any pair of common languages, allowing you to serve global user bases and localize content without complex infrastructure.
  • Structured Output: The output maintains structure whether you choose Markdown (for web rendering) or Plain Text (for database ingestion), ensuring the translated content is clean and ready for immediate use.
  • Dual-Language Workflows: The tool pairs perfectly with the Summarize PDF API, enabling advanced workflows where a document is summarized first, and that summary is then translated into a different target language.

This focus on contextual, structured translation makes your content globally accessible and ready for your application's needs.

Ensure Reliable Content Extraction Across PDF Documents

High-quality translation begins with accurate text extraction. The Translate PDF API Tool uses proprietary technology to convert the document's text content to Markdown before sending it to the AI for translation. This step ensures the AI receives content that retains its original structure, giving it a richer contextual understanding of the source material.

Our multi-stage extraction process supports translation precision:

  • Robust Pre-Processing: The API handles the difficulties of PDF parsing so that the AI receives a complete and accurate transcription of the document's content with its structure preserved.
  • Targeted Translation: Use the pages parameter to limit the translation to a specific range. This control ensures you only translate the relevant body content, excluding irrelevant pages like legal disclaimers or cover pages.

This comprehensive, multi-stage process is the foundation for high-integrity AI input, ensuring every translation delivered is accurate, contextually relevant, and reliable.

Streamline Developer Workflows with Seamless Integration

The Translate PDF API Tool is engineered for efficiency, offering features that simplify integration and content delivery into automated, high-volume applications and global workflows. These controls minimize the need for post-processing and ensure smooth data handling on your end.

Key developer-focused integration features include:

  • Simplified Delivery: Choose the output_type to receive the translated text directly within the JSON response for immediate integration, or receive a secure file download URL for larger outputs.
  • Input Flexibility: The API accepts input as a PDF file, raw Markdown, or Plain Text, providing flexibility for integration into various stages of your existing document processing pipeline.
  • Efficiency in Chains: The tool accepts either a direct file upload or a resource ID, simplifying complex, multi-step workflows where the document has already been uploaded to pdfRest for a previous processing step (like OCR or extraction).

These features significantly reduce the development time and complexity required, allowing you to focus on application logic while the API handles the resource-intensive tasks of extraction and language translation.

Start from Code Examples
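As a starting point, here is a minimal Python sketch of a translation request built with the standard library only. The base URL, the /translated-pdf-text path, the Api-Key header name, and the id field for a previously uploaded file are assumptions that this page does not confirm, so check them against the API documentation before use. The parameter names output_language, pages, output_format, and output_type are the ones described on this page.

```python
import json
import urllib.request

# Assumptions (not confirmed by this page): base URL, endpoint path,
# "Api-Key" header name, and the "id" field for an uploaded resource.
BASE_URL = "https://api.pdfrest.com"
ENDPOINT = "/translated-pdf-text"

OUTPUT_FORMATS = {"markdown", "plaintext"}
OUTPUT_TYPES = {"json", "file"}


def build_translate_request(api_key, resource_id, output_language,
                            pages=None, output_format="markdown",
                            output_type="json", base_url=BASE_URL):
    """Build (but do not send) a POST request for a translation call."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(OUTPUT_FORMATS)}")
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {sorted(OUTPUT_TYPES)}")

    payload = {
        "id": resource_id,
        "output_language": output_language,
        "output_format": output_format,
        "output_type": output_type,
    }
    # Only include a page restriction when the caller asked for one.
    if pages is not None:
        payload["pages"] = pages

    return urllib.request.Request(
        base_url + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_translate_request(...)) would perform the request. Building the request separately keeps the parameter checks testable without network access, and the base_url argument leaves room for a region-specific endpoint.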
See more code examples in our GitHub repository
Customize Your Solution

Learn about the parameters for this tool to create your custom solution.

Output Language

The output_language parameter specifies the target language for the text translation.

This parameter controls the language into which the content of the PDF, Markdown, or Plain Text file will be translated.

The language must be provided as a standard IETF BCP 47 language tag: a language code, optionally followed by hyphen-separated subtags, typically in the form {language code}-{script}-{region}.

Components of the Language Tag:

  • Language Code: The primary language identifier (2 or 3 letters, ISO 639). Examples: en (English), zh (Chinese).
  • Script Subtag (Optional): Specifies a writing script (4 letters, ISO 15924). Examples: Latn (Latin), Cyrl (Cyrillic), Hant (Traditional Han).
  • Region Subtag (Optional): Specifies a country or regional dialect. This can be a 2-letter country code (e.g., US, BR) or a 3-digit numeric region code (e.g., 419 for Latin America).

Examples of valid output_language values:

  • es (Spanish)
  • zh-Hant (Chinese, Traditional Script)
  • en-GB (English, United Kingdom)
  • pt-BR (Portuguese, Brazil)
  • fr-419 (French, Latin America)
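The components above can be checked client-side before a request is sent. The sketch below is a hypothetical helper, not part of the pdfRest API. It validates only the language[-script][-region] shape described here; full BCP 47 allows further subtags, and a well-formed tag is not necessarily one the service supports.

```python
import re

# Shape check only: 2-3 letter language code, optional 4-letter script,
# optional region (2 letters or 3 digits). Does not consult any registry.
_SIMPLE_TAG = re.compile(
    r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?"
)


def is_simple_language_tag(tag):
    """Return True if tag looks like language[-script][-region]."""
    return _SIMPLE_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ["es", "zh-Hant", "en-GB", "pt-BR", "fr-419", "english"]:
        print(candidate, is_simple_language_tag(candidate))
```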

Safe & Secure

Confidently process your sensitive data with pdfRest. Our platform is fortified for robust, Enterprise-grade security and compliance, including GDPR, HIPAA, and SOC 2 Type 2 certification. Your data's protection is our priority.

Frequently Asked Questions
Need more help? Contact Us or visit our documentation.

What is the Translate PDF API?
The Translate PDF API is an AI-powered REST API Tool that uses OpenAI technology to translate the text of any PDF, Markdown, or Plain Text file. It is designed for global accessibility, accurately converting content into your specified target language while preserving context.

Is the Translate PDF API a Pro Tool?
Yes, the Translate PDF API is categorized as a Pro Tool, which is part of a suite of advanced and specialized APIs. This tool is included with Pro and Enterprise plans. Premium plan users can also access it for a per-call fee.

What file types does the Translate PDF API accept?
The API accepts three file types: PDF, Markdown (.md), and Plain Text (.txt).

How does the API ensure high-quality translation?
The API ensures high-quality translation through robust pre-processing. First, it converts the input document's text into a structured Markdown format, which preserves the original structure (like headings and lists). This allows the OpenAI model to receive richer, more organized input, leading to a more accurate and contextually relevant translation.

Which languages does the API support?
The API supports translation to and from virtually any common language supported by OpenAI's advanced models. You must specify the target language using a valid IETF BCP 47 language tag (e.g., en, es-419, zh-Hant).

How do I specify the target language?
You use the output_language parameter, which requires a standard IETF BCP 47 language tag made up of an ISO 639 language code and optional subtags:

  • Language Code: The primary language identifier (e.g., en for English, ja for Japanese).
  • Optional Script Subtag: For different writing scripts (e.g., Hant for Traditional Chinese: zh-Hant).
  • Optional Region Subtag: For regional dialects (e.g., BR for Brazilian Portuguese: pt-BR, or 419 for Latin America: fr-419).

Can I translate only specific pages of a PDF?
Yes, you can restrict the translation using the pages parameter. You can specify a single page number (e.g., 5) or a page range (e.g., 1-5, 10-last) for PDF documents. This parameter is ignored for Markdown and Plain Text files.
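The page formats mentioned above can be validated before a request is made. The following sketch uses hypothetical helper names and accepts only the shapes shown on this page: a single page number, or a range whose end is a number or last. It also assumes page numbering starts at 1, which this page does not state. Other forms the API may accept are not covered.

```python
import re

# Accepts "5", "1-5", or "10-last"; nothing else is assumed to be valid.
_PAGES = re.compile(r"([0-9]+)(?:-([0-9]+|last))?")


def check_pages(spec):
    """Return the trimmed page spec, or raise ValueError if it is malformed."""
    spec = spec.strip()
    match = _PAGES.fullmatch(spec)
    if match is None:
        raise ValueError(f"unrecognized pages value: {spec!r}")
    # Assumption: pages are numbered from 1.
    numbers = [g for g in match.groups() if g is not None and g != "last"]
    if any(int(n) < 1 for n in numbers):
        raise ValueError("page numbers must be 1 or greater")
    return spec


def pages_field(filename, spec):
    """Build the pages field for PDF input only.

    The parameter is ignored for Markdown and Plain Text files,
    so it is left out for those inputs.
    """
    if spec is None or not filename.lower().endswith(".pdf"):
        return {}
    return {"pages": check_pages(spec)}
```

For example, pages_field("manual.pdf", "10-last") yields a pages entry, while pages_field("notes.md", "1-5") yields an empty dictionary.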

What output formats are available for the translated text?
The API supports two formats for the final translated text, controlled by the output_format parameter: plaintext and markdown. Choosing markdown is recommended as it helps preserve the structure of the translated document.

How is the translated content delivered?
You control the output delivery using the output_type parameter:

  1. json (Default): The translated text, along with the detected source languages, is included directly within the JSON response.
  2. file: A secure download URL is provided, allowing you to retrieve the translated content as a separate downloadable file.

Is my data used to train AI models?
No. Your files and any data you provide are never used for AI training or shared with third parties. We partner with OpenAI through their API, and as stated in their privacy policy, they do not train models on data used through their API.

How does pdfRest keep my data secure?
Ensuring the security and privacy of your data is a top priority at pdfRest. Our platform is built for robust, enterprise-grade security and compliance, including GDPR and HIPAA. All your files are encrypted both in transit and at rest, and they are permanently deleted after the stated file retention period (30 minutes for most plans).

Can the Translate PDF API help with GDPR compliance?
To facilitate GDPR compliance for your translation workflows, pdfRest can process your data within the European Union and adheres to other strict data protection requirements. You can ensure all processing occurs within the EU by sending your API calls to the dedicated EU endpoint. Please note that a GDPR usage fee may apply for some plans.

How do I integrate the Translate PDF API?
Integrating the Translate PDF API is straightforward. We offer comprehensive API documentation and code samples in many programming languages. The API Lab also allows you to test calls and generate code snippets directly from your browser, simplifying setup.

Is the Translate PDF API available in self-hosted products?
No, the Translate PDF API is not available in self-hosted versions of our product. This tool relies on calling out to the OpenAI API, and our self-hosted products only support fully self-contained processing capabilities. To use PDF translation, you must use our Cloud API service.

Generate a self-service API Key now!
Create your FREE API Key to start processing PDFs in seconds, only possible with pdfRest.