How to Translate PDF Text with Java

Learn how to translate PDF text to a new language in Java with the pdfRest Translate PDF API

Why Translate PDF Text with Java?

The pdfRest Translate PDF API Tool translates the text within PDF files into other languages. This tutorial shows how to send an API call from Java to the Translate PDF endpoint, so you can add PDF text translation to your Java applications.

Imagine you are a developer working on an application that processes documents for a global audience. You receive PDF documents in various languages and need to translate them into English for your team. Using the Translate PDF API, you can automate this process, saving time and reducing the potential for human error in manual translations.

Translate PDF Text with Java Code Example

import io.github.cdimascio.dotenv.Dotenv;
import java.io.File;
import java.io.IOException;
import okhttp3.MediaType;
import okhttp3.MultipartBody;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;
import org.json.JSONObject;

public class TranslatedPDFText {

  // By default, we use the US-based API service. This is the primary endpoint for global use.
  private static final String API_URL = "https://api.pdfrest.com";

  // For GDPR compliance and enhanced performance for European users, you can switch to the EU-based
  // service by commenting out the URL above and uncommenting the URL below.
  // For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
  // private static final String API_URL = "https://eu-api.pdfrest.com";

  // Specify the path to your file here, or as the first argument when running the program.
  private static final String DEFAULT_FILE_PATH = "/path/to/file.pdf";

  // Specify your API key here, or in the environment variable PDFREST_API_KEY.
  // You can also put the environment variable in a .env file.
  private static final String DEFAULT_API_KEY = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";

  public static void main(String[] args) {
    File inputFile;
    if (args.length > 0) {
      inputFile = new File(args[0]);
    } else {
      inputFile = new File(DEFAULT_FILE_PATH);
    }

    final Dotenv dotenv = Dotenv.configure().ignoreIfMalformed().ignoreIfMissing().load();

    final RequestBody inputFileRequestBody =
        RequestBody.create(inputFile, MediaType.parse("application/pdf"));
    RequestBody requestBody =
        new MultipartBody.Builder()
            .setType(MultipartBody.FORM)
            .addFormDataPart("file", inputFile.getName(), inputFileRequestBody)
            // Translates text to American English. Format the output_language as a 2-3 character
            // ISO 639 code, optionally with a region/script (e.g., 'en', 'es', 'zh-Hant',
            // 'eng-US').
            .addFormDataPart("output_language", "en-US")
            .build();
    Request request =
        new Request.Builder()
            .header("Api-Key", dotenv.get("PDFREST_API_KEY", DEFAULT_API_KEY))
            .url(API_URL + "/translated-pdf-text")
            .post(requestBody)
            .build();
    OkHttpClient client = new OkHttpClient().newBuilder().build();
    // try-with-resources closes the response (and its body) even if reading it fails.
    try (Response response = client.newCall(request).execute()) {
      System.out.println("Result code " + response.code());
      if (response.body() != null) {
        System.out.println(prettyJson(response.body().string()));
      }
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  private static String prettyJson(String json) {
    return new JSONObject(json).toString(4);
  }
}

Source: GitHub Repository

Breaking Down the Code

The code begins by importing necessary libraries such as `io.github.cdimascio.dotenv` for environment variable management, `okhttp3` for HTTP requests, and `org.json` for JSON manipulation.

private static final String API_URL = "https://api.pdfrest.com";

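This line sets the base URL of the API; the endpoint path `/translated-pdf-text` is appended to it when the request is built. It defaults to the US-based service, but you can switch to the EU-based service at `https://eu-api.pdfrest.com` for GDPR compliance.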
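The sample switches regions by commenting one constant out and the other in. As a minimal sketch of an alternative, the choice could be made at run time. The `ApiEndpoint` class, the `translateUrl` method, and the `eu` command-line argument are hypothetical names introduced for illustration; only the two base URLs and the endpoint path come from the sample.

```java
// Hypothetical helper, not part of the pdfRest sample.
final class ApiEndpoint {

  // Base URLs taken from the sample code.
  static final String US_API_URL = "https://api.pdfrest.com";
  static final String EU_API_URL = "https://eu-api.pdfrest.com";

  // Builds the full Translate PDF endpoint URL for the chosen region.
  static String translateUrl(boolean useEuService) {
    String baseUrl = useEuService ? EU_API_URL : US_API_URL;
    return baseUrl + "/translated-pdf-text";
  }

  public static void main(String[] args) {
    // Assumption of this sketch: passing "eu" as the first argument selects the EU service.
    boolean useEu = args.length > 0 && args[0].equalsIgnoreCase("eu");
    System.out.println(translateUrl(useEu));
  }
}
```

The returned string can be passed to `.url(...)` on the `Request.Builder` in place of `API_URL + "/translated-pdf-text"`.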

private static final String DEFAULT_FILE_PATH = "/path/to/file.pdf";

This specifies the default file path for the PDF to be translated. You can override this by providing a file path as a command-line argument.
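The sample uploads whatever path it is given, so a typo or a forgotten placeholder only surfaces once the request fails. A minimal sketch of a local check before the upload might look like the following. `InputFileResolver` and its methods are hypothetical names, and the `.pdf` extension test is an assumption of this sketch rather than a pdfRest requirement.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical helper, not part of the pdfRest sample.
final class InputFileResolver {

  // Placeholder path, mirroring DEFAULT_FILE_PATH in the sample.
  static final String DEFAULT_FILE_PATH = "/path/to/file.pdf";

  // Returns the first command-line argument as a File, or the default path when none is given.
  static File resolve(String[] args) {
    return new File(args.length > 0 ? args[0] : DEFAULT_FILE_PATH);
  }

  // True only if the path is a readable regular file whose name ends in ".pdf".
  static boolean looksLikeReadablePdf(File file) {
    return file.isFile()
        && file.canRead()
        && file.getName().toLowerCase(Locale.ROOT).endsWith(".pdf");
  }

  public static void main(String[] args) {
    File inputFile = resolve(args);
    if (!looksLikeReadablePdf(inputFile)) {
      System.err.println("Not a readable PDF file: " + inputFile.getPath());
      System.exit(1);
    }
    System.out.println("Ready to upload " + inputFile.getName());
  }
}
```

Failing early this way keeps a local mistake from being reported as an API error.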

private static final String DEFAULT_API_KEY = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";

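The API key is needed for authentication. The program first looks for it in the environment variable `PDFREST_API_KEY`, which can also be defined in a `.env` file, and falls back to this constant only when the variable is not set. Replace the placeholder with your own key if you prefer to set it directly in the code.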
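If neither the environment variable nor the constant is set, the sample sends the placeholder string as the key. A minimal sketch of a guard against that is shown below. It reads from a plain `Map` (for example `System.getenv()`) rather than from dotenv so that it has no dependencies; `ApiKeyResolver` and `resolve` are hypothetical names introduced for illustration.

```java
import java.util.Map;

// Hypothetical helper, not part of the pdfRest sample.
final class ApiKeyResolver {

  // The placeholder value used for DEFAULT_API_KEY in the sample.
  static final String PLACEHOLDER_API_KEY = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";

  // Prefers PDFREST_API_KEY from the given environment, then the fallback.
  // Throws if the result is missing, empty, or still the placeholder.
  static String resolve(Map<String, String> env, String fallback) {
    String fromEnv = env.get("PDFREST_API_KEY");
    String key = (fromEnv != null && !fromEnv.trim().isEmpty()) ? fromEnv.trim() : fallback;
    if (key == null || key.trim().isEmpty() || key.equals(PLACEHOLDER_API_KEY)) {
      throw new IllegalStateException(
          "Set PDFREST_API_KEY or replace the placeholder API key.");
    }
    return key;
  }

  public static void main(String[] args) {
    try {
      String key = resolve(System.getenv(), PLACEHOLDER_API_KEY);
      System.out.println("Using an API key of length " + key.length());
    } catch (IllegalStateException e) {
      System.err.println(e.getMessage());
    }
  }
}
```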

RequestBody requestBody = new MultipartBody.Builder()
    .setType(MultipartBody.FORM)
    .addFormDataPart("file", inputFile.getName(), inputFileRequestBody)
    .addFormDataPart("output_language", "en-US")
    .build();

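This snippet creates a multipart form request body. It includes the PDF file and sets `output_language` to `en-US`, which translates the text to American English. The value is a 2-3 character ISO 639 code, optionally followed by a region or script (for example `en`, `es`, `zh-Hant`, or `eng-US`).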
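A malformed `output_language` value can be caught locally before the request is sent. The sketch below uses the standard `java.util.Locale.Builder` to check that a tag is well formed and that its language part has two or three letters. It checks the shape only; whether pdfRest supports a given language is not something this code can know. `OutputLanguageCheck` and `isWellFormed` are hypothetical names.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper, not part of the pdfRest sample.
final class OutputLanguageCheck {

  // True if the tag parses as a language tag with a 2-3 letter language subtag.
  static boolean isWellFormed(String tag) {
    if (tag == null || tag.trim().isEmpty()) {
      return false;
    }
    try {
      Locale locale = new Locale.Builder().setLanguageTag(tag).build();
      int length = locale.getLanguage().length();
      return length == 2 || length == 3;
    } catch (IllformedLocaleException e) {
      // Thrown for tags such as "en_US" that use an invalid separator.
      return false;
    }
  }

  public static void main(String[] args) {
    for (String tag : new String[] {"en", "es", "zh-Hant", "eng-US", "en_US"}) {
      System.out.println(tag + " well formed: " + isWellFormed(tag));
    }
  }
}
```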

Request request = new Request.Builder()
    .header("Api-Key", dotenv.get("PDFREST_API_KEY", DEFAULT_API_KEY))
    .url(API_URL + "/translated-pdf-text")
    .post(requestBody)
    .build();

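A `Request` object is built with the API key in the `Api-Key` header, the endpoint URL, and the multipart body sent as a POST. The request is then executed using `OkHttpClient`, and the program prints the HTTP result code followed by the formatted JSON response.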
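The sample prints the result code without interpreting it. A minimal sketch of how a caller might act on that code is shown below. The groupings follow general HTTP status semantics and are not taken from pdfRest documentation; `ResultCodeCheck` and its methods are hypothetical names.

```java
// Hypothetical helper, not part of the pdfRest sample.
final class ResultCodeCheck {

  // HTTP defines the 2xx range as success.
  static boolean isSuccess(int statusCode) {
    return statusCode >= 200 && statusCode < 300;
  }

  // Maps a status code to a short hint based on generic HTTP semantics.
  static String describe(int statusCode) {
    if (isSuccess(statusCode)) {
      return "success";
    }
    if (statusCode == 401 || statusCode == 403) {
      return "authentication problem: check the Api-Key header";
    }
    if (statusCode >= 400 && statusCode < 500) {
      return "client error: check the file and form fields";
    }
    if (statusCode >= 500 && statusCode < 600) {
      return "server error: retry later";
    }
    return "unexpected status";
  }

  public static void main(String[] args) {
    for (int code : new int[] {200, 401, 422, 503}) {
      System.out.println(code + ": " + describe(code));
    }
  }
}
```

In the sample, `describe(response.code())` could be printed next to the result code, and the program could exit with a non-zero status when `isSuccess` returns false.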

Beyond the Tutorial

In this tutorial, you learned how to use Java to make an API call to the pdfRest Translate PDF endpoint. This process allows you to automate the translation of PDF text into different languages.

To explore more, try out the pdfRest API Tools in the API Lab. For more detailed information, refer to the API Reference Guide. Note that this example demonstrates a multipart API call; pdfRest also provides code samples that use JSON payloads.
