How to Extract Images from PDF Files in .NET with C#

Why Extract PDF Images with C#?

The pdfRest Extract Images API Tool lets developers extract images from PDF files programmatically. This tutorial walks through sending an API call to extract images using C#. Integrating this API into a C# application lets you automate image extraction, which helps when you handle large volumes of documents or when extraction is one step in a larger document processing workflow.

For example, a company might receive numerous PDF reports containing graphs and charts that need to be analyzed separately. With the Extract Images API, it can automate the extraction of these visual elements instead of pulling them out by hand, which saves time and reduces the potential for errors.

Extract PDF Images with C# Code Example

using System.Text;

// Written with top-level statements and implicit usings (.NET 6 or later).
using (var httpClient = new HttpClient { BaseAddress = new Uri("https://api.pdfrest.com") })
{
    // POST to the extracted-images endpoint, relative to the base address.
    using (var request = new HttpRequestMessage(HttpMethod.Post, "extracted-images"))
    {
        // Replace the placeholder with your own API key.
        request.Headers.TryAddWithoutValidation("Api-Key", "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
        request.Headers.Accept.Add(new("application/json"));
        var multipartContent = new MultipartFormDataContent();

        // Attach the PDF as the "file" form field; replace the path and file name.
        var byteArray = File.ReadAllBytes("/path/to/file");
        var byteAryContent = new ByteArrayContent(byteArray);
        multipartContent.Add(byteAryContent, "file", "file_name");
        byteAryContent.Headers.TryAddWithoutValidation("Content-Type", "application/pdf");

        // Extract images from every page, first through last.
        var byteArrayOption = new ByteArrayContent(Encoding.UTF8.GetBytes("1-last"));
        multipartContent.Add(byteArrayOption, "pages");

        // Send the request and read the response body as a string.
        request.Content = multipartContent;
        var response = await httpClient.SendAsync(request);

        var apiResult = await response.Content.ReadAsStringAsync();

        Console.WriteLine("API response received.");
        Console.WriteLine(apiResult);
    }
}

Source: GitHub Repository

Breaking Down the Code

The code begins by setting up an HttpClient with the base address of the pdfRest API:

using (var httpClient = new HttpClient { BaseAddress = new Uri("https://api.pdfrest.com") })

Next, an HttpRequestMessage is created to send a POST request to the "extracted-images" endpoint:

using (var request = new HttpRequestMessage(HttpMethod.Post, "extracted-images"))

The API key is added to the request headers for authentication:

request.Headers.TryAddWithoutValidation("Api-Key", "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
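
Hard-coding the key works for a quick test, but a shared or committed project would usually read it from configuration. The following is a minimal sketch of one way to do that, assuming the key is stored in an environment variable. The variable name PDFREST_API_KEY and the helper TryGetApiKey are hypothetical and are not part of the pdfRest sample.

```csharp
using System;

// Reads the key from an environment variable instead of the source file.
// The variable name is supplied by the caller; "PDFREST_API_KEY" below is a
// hypothetical choice, not a name defined by pdfRest.
static bool TryGetApiKey(string variableName, out string apiKey)
{
    var value = Environment.GetEnvironmentVariable(variableName);
    apiKey = value?.Trim() ?? string.Empty;
    return apiKey.Length > 0;
}

if (TryGetApiKey("PDFREST_API_KEY", out var apiKey))
{
    // In the tutorial code this value would replace the hard-coded placeholder:
    // request.Headers.TryAddWithoutValidation("Api-Key", apiKey);
    Console.WriteLine("API key loaded from the environment.");
}
else
{
    Console.Error.WriteLine("Set PDFREST_API_KEY before running the sample.");
}
```

Failing early with a clear message tends to be easier to diagnose than an authentication error returned by the API.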

The request is set to accept JSON responses:

request.Headers.Accept.Add(new("application/json"));

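A MultipartFormDataContent is created to hold the PDF file and the additional options. The file is read into a byte array, added to the form under the field name "file", and labeled with the application/pdf content type. Replace "/path/to/file" and "file_name" with your PDF's path and file name:

var multipartContent = new MultipartFormDataContent();
var byteArray = File.ReadAllBytes("/path/to/file");
var byteAryContent = new ByteArrayContent(byteArray);
multipartContent.Add(byteAryContent, "file", "file_name");
byteAryContent.Headers.TryAddWithoutValidation("Content-Type", "application/pdf");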
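
The file-handling step can also be wrapped in a small helper. The sketch below is one possible variation, not the pdfRest sample itself: the helper name BuildPdfUpload is hypothetical, the file is read asynchronously, the typed ContentType property replaces TryAddWithoutValidation, and the upload file name is derived from the path with Path.GetFileName.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Builds a multipart form with a single "file" part, mirroring the tutorial code.
// BuildPdfUpload is a hypothetical helper name used only for this sketch.
static MultipartFormDataContent BuildPdfUpload(byte[] pdfBytes, string fileName)
{
    var filePart = new ByteArrayContent(pdfBytes);
    // Typed header property; yields the same application/pdf Content-Type.
    filePart.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");

    var form = new MultipartFormDataContent();
    form.Add(filePart, "file", fileName);
    return form;
}

var pdfPath = "/path/to/file"; // placeholder path from the tutorial
if (File.Exists(pdfPath))
{
    var bytes = await File.ReadAllBytesAsync(pdfPath);
    using var upload = BuildPdfUpload(bytes, Path.GetFileName(pdfPath));
    Console.WriteLine($"Prepared {bytes.Length} bytes for upload.");
}
else
{
    Console.Error.WriteLine($"No file found at {pdfPath}.");
}
```

Checking that the file exists before building the request avoids an unhandled exception from the file read when the placeholder path has not been replaced.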

The "pages" parameter specifies which pages to extract images from. Here, "1-last" covers every page from the first to the last:

var byteArrayOption = new ByteArrayContent(Encoding.UTF8.GetBytes("1-last"));
multipartContent.Add(byteArrayOption, "pages");
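
If a request carries more than one option, the encode-and-add pattern can be factored into a helper. This is a minimal sketch using the same technique as the tutorial (UTF-8 bytes in a ByteArrayContent); the helper name AddFormField is hypothetical, and "1-last" is the only value taken from this tutorial.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Adds one plain form field, encoded as UTF-8 bytes like the tutorial's "pages" option.
// AddFormField is a hypothetical helper name used only for this sketch.
static void AddFormField(MultipartFormDataContent form, string name, string value)
{
    form.Add(new ByteArrayContent(Encoding.UTF8.GetBytes(value)), name);
}

using var options = new MultipartFormDataContent();
AddFormField(options, "pages", "1-last"); // the value shown in this tutorial
Console.WriteLine("Added the pages option to the form.");
```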

The multipart content is assigned to the request, and the request is sent asynchronously. The response body is read as a string and printed to the console:

request.Content = multipartContent;
var response = await httpClient.SendAsync(request);
var apiResult = await response.Content.ReadAsStringAsync();
Console.WriteLine("API response received.");
Console.WriteLine(apiResult);
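
The sample prints whatever comes back, whether the call succeeded or not. As a hedged sketch of slightly more defensive handling, the code below checks the HTTP status and indents the JSON for readability. The helper names PrettyPrintJson and DescribeResult are hypothetical, and the JSON string is a stand-in so the sketch runs without a network call. It makes no assumption about the fields in the actual pdfRest response.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text.Json;

// Re-indents a JSON string for logging; returns the input unchanged if it is not JSON.
static string PrettyPrintJson(string raw)
{
    try
    {
        using var document = JsonDocument.Parse(raw);
        return JsonSerializer.Serialize(
            document.RootElement,
            new JsonSerializerOptions { WriteIndented = true });
    }
    catch (JsonException)
    {
        return raw;
    }
}

// Turns a response and its body into a message suitable for the console.
static string DescribeResult(HttpResponseMessage response, string body)
{
    return response.IsSuccessStatusCode
        ? PrettyPrintJson(body)
        : $"Request failed with HTTP {(int)response.StatusCode}: {body}";
}

// Stand-in response and placeholder JSON (not the real pdfRest schema).
// In the tutorial code, pass the real `response` and `apiResult` instead.
using var sample = new HttpResponseMessage(HttpStatusCode.OK);
Console.WriteLine(DescribeResult(sample, "{\"placeholder\":true}"));
```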

Beyond the Tutorial

In this tutorial, you learned how to use C# to send a request to the pdfRest Extract Images API and handle the response. This allows you to automate the process of extracting images from PDF documents efficiently. To explore more functionalities, try out all the pdfRest API Tools in the API Lab. For detailed information on each API, refer to the API Reference Guide.

Note: This example demonstrates a multipart API call. For code samples using JSON payloads, visit this GitHub repository.