How to Summarize PDF Text with PHP

Learn how to use PHP to summarize PDF text content with the pdfRest Summarize PDF API.

Why Summarize PDF with PHP?

The pdfRest Summarize PDF API Tool extracts the text from a PDF document and condenses it into a short summary. This tutorial shows how to send an API call to the Summarize PDF endpoint using PHP so that you can automate the summarization of PDF content.

Summaries let readers grasp the main ideas of a lengthy PDF without reading the entire document. For instance, a researcher could use this tool to summarize academic papers or reports, saving time while still capturing the key points.

Summarize PDF with PHP Code Example

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Utils;

// By default, we use the US-based API service. This is the primary endpoint for global use.
$apiUrl = "https://api.pdfrest.com";

/* For GDPR compliance and enhanced performance for European users, you can switch to the EU-based service by uncommenting the URL below.
 * For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
 */
//$apiUrl = "https://eu-api.pdfrest.com";

$client = new Client();

$headers = [
  'Api-Key' => 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
];

$options = [
  'multipart' => [
    [
      'name' => 'file',
      'contents' => Utils::tryFopen('/path/to/file', 'r'),
      'filename' => 'filename.pdf',
      'headers' => [
        'Content-Type' => 'application/pdf'
      ]
    ],
    [
      'name' => 'target_word_count',
      'contents' => '100'
    ]
  ]
];

$request = new Request('POST', $apiUrl.'/summarized-pdf-text', $headers);
$res = $client->sendAsync($request, $options)->wait();
echo $res->getBody();

Source: GitHub

Breaking Down the Code

The code begins by including the necessary libraries using Composer's autoload feature with require 'vendor/autoload.php';. It uses the Guzzle HTTP client to handle the HTTP requests.

The $apiUrl variable defines the endpoint for the API calls. By default, it uses the US-based service, but there's an option to switch to the EU-based service for GDPR compliance by uncommenting the alternative URL.
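As an alternative to commenting and uncommenting lines, the region choice can be driven by a flag. The sketch below is only an illustration: the function name pdfrestBaseUrl and its $useEu parameter are hypothetical and not part of the pdfRest sample; the two URLs are the ones shown in the code above.

```php
<?php
// Hypothetical helper (not part of the pdfRest sample): returns the base URL
// for the US-based service by default, or the EU-based service when requested.
function pdfrestBaseUrl(bool $useEu = false): string
{
    return $useEu
        ? 'https://eu-api.pdfrest.com'
        : 'https://api.pdfrest.com';
}

// Same default as the tutorial code: the US-based service.
$apiUrl = pdfrestBaseUrl();
```

Passing true, as in pdfrestBaseUrl(true), would select the EU-based service instead.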

The $headers array includes the 'Api-Key', which is essential for authenticating requests. Replace 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with your actual API key.
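To avoid hard-coding the key in source files, one option is to read it from the environment. This is a minimal sketch under assumptions: the environment variable name PDFREST_API_KEY and the helper pdfrestHeaders are hypothetical names chosen for illustration, not something pdfRest prescribes.

```php
<?php
// Hypothetical helper: builds the headers array and fails early when the
// key is missing, instead of sending an unauthenticated request.
function pdfrestHeaders(?string $apiKey): array
{
    if ($apiKey === null || trim($apiKey) === '') {
        throw new InvalidArgumentException('Missing pdfRest API key.');
    }
    return ['Api-Key' => trim($apiKey)];
}

try {
    // PDFREST_API_KEY is an assumed variable name; getenv() returns false
    // when the variable is not set, which is mapped to null here.
    $headers = pdfrestHeaders(getenv('PDFREST_API_KEY') ?: null);
} catch (InvalidArgumentException $e) {
    fwrite(STDERR, $e->getMessage() . PHP_EOL);
}
```

The resulting $headers array has the same shape as the one in the tutorial code, so it can be passed to the Request constructor unchanged.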

The $options array sets up a multipart request with two parts: the PDF file and the target word count. The 'file' part uses Utils::tryFopen('/path/to/file', 'r') to open the PDF for reading; replace '/path/to/file' with the path to your PDF and 'filename.pdf' with its file name. The 'target_word_count' part specifies the desired word count for the summary, which is 100 in this example.
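The multipart array can also be assembled by a small function that validates its inputs first. The sketch below is illustrative only: buildSummaryOptions is a hypothetical name, the requirement that the word count be positive is an assumption rather than a documented pdfRest rule, and plain fopen is swapped in for Utils::tryFopen so the block runs without Guzzle loaded (Guzzle accepts a stream resource as 'contents').

```php
<?php
// Hypothetical helper: builds the same two-part multipart payload as the
// tutorial code, after checking the file and the word count.
function buildSummaryOptions(string $pdfPath, int $targetWordCount): array
{
    if (!is_file($pdfPath) || !is_readable($pdfPath)) {
        throw new InvalidArgumentException("Cannot read file: $pdfPath");
    }
    // Assumption: a summary length below 1 word is treated as invalid here.
    if ($targetWordCount < 1) {
        throw new InvalidArgumentException('target_word_count must be positive.');
    }

    return [
        'multipart' => [
            [
                'name' => 'file',
                'contents' => fopen($pdfPath, 'r'),
                'filename' => basename($pdfPath),
                'headers' => ['Content-Type' => 'application/pdf'],
            ],
            [
                'name' => 'target_word_count',
                // Multipart field values are sent as strings.
                'contents' => (string) $targetWordCount,
            ],
        ],
    ];
}
```

A call such as buildSummaryOptions('/path/to/file', 100) would then produce an array equivalent to the $options shown in the example.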

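A Request object is created with the method 'POST', the API endpoint, and the headers. The request is sent with $client->sendAsync($request, $options)->wait();. Although sendAsync returns a promise, calling wait() on it immediately blocks until the response arrives, so the script behaves synchronously. The response body is then printed using echo $res->getBody();.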
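Rather than echoing the raw body, a script may want to decode it first. The sketch below assumes the response body is a JSON object; it does not assume any particular field names, so consult the API Reference Guide for the actual response shape. The function name decodeSummaryResponse is hypothetical.

```php
<?php
// Hypothetical helper: decodes a response body that is assumed to be a
// JSON object and returns it as an associative array.
function decodeSummaryResponse(string $body): array
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) raises JsonException on malformed input.
    $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object in the response.');
    }
    return $data;
}
```

In the tutorial code this could be called as decodeSummaryResponse((string) $res->getBody()). Note that Guzzle throws an exception for 4xx and 5xx responses by default, so a production script would likely wrap the wait() call in a try/catch block as well.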

Beyond the Tutorial

In this tutorial, you learned how to make a multipart API call to the pdfRest Summarize PDF endpoint using PHP. This example demonstrated how to configure the request and handle the response.

To explore more capabilities of pdfRest, visit the API Lab to demo all available API tools. For detailed information on each API, refer to the API Reference Guide.

Note: This example uses a multipart API call. For code samples using JSON payloads, visit GitHub.
