How to Validate PDF/A Conformance with PHP
Why Validate PDF/A with PHP?
The pdfRest Query PDF API Tool is a powerful resource that allows developers to extract a wide range of information from PDF files. By using this tool, you can programmatically query properties such as metadata, page count, presence of annotations, and much more. This tutorial will guide you through the process of sending an API call to Query PDF using PHP to validate PDF/A conformance.
Businesses of all sizes can use PDF/A conformance validation to help ensure the long-term accessibility and usability of their digital documents. This is especially important for organizations that rely on electronic archives, such as healthcare providers managing patient records or government agencies storing historical documents. Confirming that a file conforms to PDF/A helps ensure consistent rendering regardless of the software used to create it, reduces compatibility issues when sharing documents externally, and supports reliable search based on embedded metadata. By automating PDF/A validation, businesses can streamline document management workflows, minimize errors, and keep their critical information accessible for years to come.
Validate PDF/A with PHP Code Example
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Utils;

$client = new Client();

// Replace the placeholder with your own pdfRest API key.
$headers = [
    'Api-Key' => 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
];

$options = [
    'multipart' => [
        [
            // The PDF file to query.
            'name' => 'file',
            'contents' => Utils::tryFopen('/path/to/file', 'r'),
            'filename' => '/path/to/file',
            'headers' => [
                // Empty in this sample; set it to the file's MIME type
                // (application/pdf for a PDF).
                'Content-Type' => ''
            ]
        ],
        [
            // The properties to query; here only the PDF/A check.
            'name' => 'queries',
            'contents' => 'pdfa'
        ]
    ]
];

$request = new Request('POST', 'https://api.pdfrest.com/pdf-info', $headers);
$res = $client->sendAsync($request, $options)->wait();
echo $res->getBody();
This code is sourced from the pdf-rest-api-samples repository on GitHub.
Breaking Down the Code
The provided PHP script uses the Guzzle HTTP client to make an API request. The require 'vendor/autoload.php'; line loads Composer's autoloader, which makes the Guzzle classes available. The Client class is instantiated to create a new HTTP client. The $headers array holds the Api-Key header, which carries the API key required for authentication.
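Hard-coding the key is fine for a quick test, but it is easy to commit by accident. The following is a minimal sketch of reading the key from an environment variable instead. The variable name PDFREST_API_KEY and the helper name pdfRestHeaders are assumptions made for illustration; neither comes from the pdfRest sample.

```php
<?php
// Hypothetical helper: builds the request headers from an environment variable.
// PDFREST_API_KEY is an assumed variable name, not one defined by pdfRest.
function pdfRestHeaders(string $envVar = 'PDFREST_API_KEY'): array
{
    $apiKey = getenv($envVar);

    // getenv() returns false when the variable is not set.
    if ($apiKey === false || $apiKey === '') {
        throw new RuntimeException("Environment variable {$envVar} is not set.");
    }

    // 'Api-Key' is the header name used in the tutorial script.
    return ['Api-Key' => $apiKey];
}
```

The returned array can be passed as the third argument to new Request(...) in place of the hard-coded $headers array.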
The $options array defines the multipart request payload. The 'file' part includes the PDF file to be queried, with its contents, filename, and content type. The 'queries' part specifies the information to retrieve from the PDF. This can be a comma-separated list that represents different properties of the PDF to be queried, but in this case we are only running the "pdfa" check for PDF/A validation.
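To illustrate the comma-separated form of 'queries', here is a small sketch that builds the multipart array from a list of query names. The helper name buildQueryMultipart is hypothetical, and 'page_count' in the usage line is an assumed query name; check the API Reference Guide for the names the endpoint actually accepts. The sketch uses PHP's built-in fopen rather than Guzzle's Utils::tryFopen so that it runs without Guzzle installed.

```php
<?php
// Hypothetical helper: builds the 'multipart' option for a Query PDF request.
function buildQueryMultipart(string $pdfPath, array $queries): array
{
    if (!is_readable($pdfPath)) {
        throw new InvalidArgumentException("Cannot read file: {$pdfPath}");
    }

    return [
        [
            'name'     => 'file',
            // Guzzle accepts a stream resource as multipart contents.
            'contents' => fopen($pdfPath, 'r'),
            'filename' => basename($pdfPath),
        ],
        [
            'name'     => 'queries',
            // Join the query names into one comma-separated string.
            'contents' => implode(',', $queries),
        ],
    ];
}
```

A call such as buildQueryMultipart('/path/to/file', ['pdfa', 'page_count']) would produce a 'queries' part whose contents are the single string pdfa,page_count, and the result can be used as the value of the 'multipart' key in $options.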
A new Request object is created with the 'POST' method, the API endpoint URL, and the headers. The request is sent asynchronously using $client->sendAsync(), and the script waits for the response with ->wait(). Finally, the response body is echoed out, which contains the queried information from the PDF.
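Echoing the body is enough for a demo, but a script that acts on the result needs to decode it. The sketch below assumes the response body is a JSON object keyed by query name, so that a "pdfa" query yields a "pdfa" key. That shape is an assumption, as is the helper name extractQueryResult, and the sample body is made up for illustration rather than real API output.

```php
<?php
// Hypothetical helper: pulls one queried property out of a JSON response body.
// Assumes the body is a JSON object keyed by query name (for example "pdfa").
function extractQueryResult(string $body, string $query)
{
    // JSON_THROW_ON_ERROR raises JsonException on malformed JSON.
    $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data) || !array_key_exists($query, $data)) {
        return null; // The property was not present in the response.
    }

    return $data[$query];
}

// Made-up sample body for illustration only, not real API output.
$sampleBody = '{"pdfa": "example-value"}';
var_dump(extractQueryResult($sampleBody, 'pdfa'));
```

In the tutorial script, the first argument would be (string) $res->getBody(), since Guzzle returns the body as a stream object that can be cast to a string.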
Beyond the Tutorial
By following the steps in this tutorial, you've learned how to set up and execute a multipart API call to pdfRest's Query PDF API endpoint using PHP. This allows you to programmatically access detailed information about PDF files, which can be crucial for various document processing tasks.
Feel free to demo all of the pdfRest API Tools in the API Lab and refer to the API Reference Guide for more details on how to use the pdfRest Cloud API.
Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at the pdf-rest-api-samples repository on GitHub.