How to Redact PDF Text with PHP

Learn how to redact text in a PDF document using the pdfRest Redact PDF API Tool with PHP.

Why Redact PDF Text with PHP?

The pdfRest Redact PDF API Tool lets developers automate the redaction of sensitive information in PDF documents. From PHP, a single multipart API call submits a document together with its redaction rules, which suits applications that handle confidential files. This tutorial walks through sending that call to the Redact PDF endpoint using PHP.

In a real-world scenario, a legal firm might need to redact sensitive client information such as email addresses, phone numbers, or specific keywords from legal documents before sharing them with external parties. Using the Redact PDF API, the firm can automate this process, ensuring that all sensitive data is consistently and accurately redacted, saving time and reducing the risk of human error.

Redact PDF Text with PHP Code Example

<?php
require 'vendor/autoload.php'; // Require the autoload file to load Guzzle HTTP client.

use GuzzleHttp\Client; // Import the Guzzle HTTP client namespace.
use GuzzleHttp\Psr7\Request; // Import the PSR-7 Request class.
use GuzzleHttp\Psr7\Utils; // Import the PSR-7 Utils class for working with streams.

$client = new Client(); // Create a new instance of the Guzzle HTTP client.

$headers = [
  'Api-Key' => 'xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' // Set the API key in the headers for authentication.
];

$options = [
  'multipart' => [
    [
      'name' => 'file', // Specify the field name for the file.
      'contents' => Utils::tryFopen('/path/to/file', 'r'), // Open the file specified by the '/path/to/file' for reading.
      'filename' => '/path/to/file', // Set the filename for the file to be processed, in this case, '/path/to/file'.
      'headers' => [
        'Content-Type' => 'application/pdf' // Set the Content-Type header for the file (the MIME type for PDF).
      ]
    ],
    [
      'name' => 'redactions', // Specify the field name for the text options.
      'contents' => '[{"type":"preset","value":"email"},{"type":"regex","value":"(\\\\+\\\\d{1,2}\\\\s)?\\\\(?\\\\d{3}\\\\)?[\\\\s.-]\\\\d{3}[\\\\s.-]\\\\d{4}"},{"type":"literal","value":"word"}]' // Set the value for the redactions option. This is a JSON-formatted string consisting of an array with sets of text options.
    ],
    [
      'name' => 'output', // Specify the field name for the output option.
      'contents' => 'pdfrest_pdf_with_redacted_text' // Set the value for the output option (in this case, 'pdfrest_pdf_with_redacted_text').
    ]
  ]
];

$request = new Request('POST', 'https://api.pdfrest.com/pdf-with-redacted-text-preview', $headers); // Create a new HTTP POST request with the API endpoint and headers.

$res = $client->sendAsync($request, $options)->wait(); // Send the asynchronous request and wait for the response.

echo $res->getBody(); // Output the response body, which contains the PDF with redaction preview annotations.

Source: GitHub
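
The regex rule in the listing is hard to read through its layers of escaping. As a minimal sketch, the same pattern can be tried locally with PHP's preg_match before it is sent to the API. The helper name matchesPhone and the sample strings are made up for illustration, and PHP's PCRE engine may not behave identically to the engine the API uses, so treat this as a sanity check only.

```php
<?php
// Same pattern as the "regex" rule in the listing, written with single backslashes.
$phonePattern = '(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}';

// Hypothetical helper for this sketch: true if $text contains a match for $pattern.
function matchesPhone(string $pattern, string $text): bool
{
    // '/' is safe as a delimiter here because the pattern contains no slash.
    return preg_match('/' . $pattern . '/', $text) === 1;
}

// Made-up sample strings for illustration.
$samples = ['(555) 123-4567', '+1 555.123.4567', '5551234567', 'no phone here'];
foreach ($samples as $sample) {
    $verdict = matchesPhone($phonePattern, $sample) ? 'match' : 'no match';
    printf("%-18s %s\n", $sample, $verdict);
}
```

The pattern requires a separator (whitespace, dot, or hyphen) between digit groups, so an unseparated number such as 5551234567 is not matched by this rule.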

Breaking Down the Code

The code begins by including the autoload file to use the Guzzle HTTP client:

require 'vendor/autoload.php';

This is followed by importing necessary classes from Guzzle:

use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Utils;

A new Guzzle client is instantiated:

$client = new Client();

The API key is set in the headers for authentication purposes:

$headers = [
  'Api-Key' => 'xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
];
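
Hard-coding the key is fine for a quick test, but it is easy to commit by accident. One possible alternative, sketched here under the assumption that the key is stored in an environment variable, is to read it at runtime. The helper name buildHeaders and the variable name PDFREST_API_KEY are choices made for this sketch, not names the API requires.

```php
<?php
// Hypothetical helper: builds the headers array from an environment variable.
// PDFREST_API_KEY is a name chosen for this sketch, not one required by pdfRest.
function buildHeaders(string $envVar = 'PDFREST_API_KEY'): array
{
    $apiKey = getenv($envVar);

    // getenv() returns false when the variable is not set.
    if ($apiKey === false || trim($apiKey) === '') {
        throw new RuntimeException("Environment variable {$envVar} is not set.");
    }

    return ['Api-Key' => trim($apiKey)];
}
```

With the variable exported in the shell, $headers = buildHeaders(); can stand in for the literal array above.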

The options array is configured for a multipart request. It includes the file to be redacted, the redaction rules as a JSON-formatted string, and the output option, which names the resulting file:

$options = [
  'multipart' => [
    [
      'name' => 'file',
      'contents' => Utils::tryFopen('/path/to/file', 'r'),
      'filename' => '/path/to/file',
      'headers' => [
        'Content-Type' => 'application/pdf'
      ]
    ],
    [
      'name' => 'redactions',
      'contents' => '[{"type":"preset","value":"email"},{"type":"regex","value":"(\\\\+\\\\d{1,2}\\\\s)?\\\\(?\\\\d{3}\\\\)?[\\\\s.-]\\\\d{3}[\\\\s.-]\\\\d{4}"},{"type":"literal","value":"word"}]'
    ],
    [
      'name' => 'output',
      'contents' => 'pdfrest_pdf_with_redacted_text'
    ]
  ]
];
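
Writing the redactions value by hand means escaping each backslash twice: once for the PHP string and once for JSON. A minimal sketch of an alternative is to describe the rules as a PHP array and let json_encode handle the JSON escaping. The variable names below are illustrative; the rule types and values are the ones from the listing.

```php
<?php
// The phone-number pattern from the listing, written once with single backslashes.
// In a single-quoted PHP string, sequences such as \d and \s are kept literally.
$phonePattern = '(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}';

$redactionRules = [
    ['type' => 'preset',  'value' => 'email'],
    ['type' => 'regex',   'value' => $phonePattern],
    ['type' => 'literal', 'value' => 'word'],
];

// JSON_THROW_ON_ERROR (PHP 7.3+) raises a JsonException instead of returning false.
$redactionsJson = json_encode($redactionRules, JSON_THROW_ON_ERROR);

echo $redactionsJson, PHP_EOL;
```

The resulting string can be passed as the 'contents' of the redactions part. It is the same value the hand-written literal above produces at runtime.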

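The request is created and sent with sendAsync(); calling wait() blocks until the response arrives. The response body is then printed. Note that the endpoint used here, pdf-with-redacted-text-preview, produces a redaction preview: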

$request = new Request('POST', 'https://api.pdfrest.com/pdf-with-redacted-text-preview', $headers);
$res = $client->sendAsync($request, $options)->wait();
echo $res->getBody();
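
Printing the raw body is enough for a first test. If the body is JSON, which is an assumption this tutorial does not confirm, it can be decoded into an array for further processing. The sketch below uses a made-up sample body; the helper name decodeResponseBody and the field names outputUrl and outputId are assumptions for illustration and should be checked against the API Reference Guide.

```php
<?php
// Hypothetical helper: decodes a JSON response body into an associative array.
function decodeResponseBody(string $body): array
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) turns malformed JSON into a JsonException.
    $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object in the response body.');
    }

    return $data;
}

// Made-up sample body; the field names are assumptions, not confirmed by this tutorial.
$sampleBody = '{"outputUrl":"https://example.com/resource/abc123","outputId":"abc123"}';

$result = decodeResponseBody($sampleBody);
echo $result['outputUrl'] ?? 'no outputUrl field', PHP_EOL;
```

With the Guzzle response from the tutorial, the call would look like decodeResponseBody((string) $res->getBody()), since the PSR-7 body stream can be cast to a string.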

Beyond the Tutorial

In this tutorial, you learned how to send an API call to the pdfRest Redact PDF endpoint using PHP. This allows you to automate the redaction of sensitive information in PDF documents. To explore more, you can try out all the pdfRest API Tools in the API Lab. For more detailed information, refer to the API Reference Guide.

Note: This example demonstrates a multipart API call. Code samples using JSON payloads can be found here.
