How to Convert PDF to Markdown with JavaScript in Node.js
Why Convert PDF to Markdown with JavaScript?
The pdfRest PDF to Markdown API Tool provides a seamless way to convert PDF documents into Markdown format using JavaScript. This tutorial will guide you through sending an API call to the PDF to Markdown endpoint using JavaScript. By following the steps outlined, you'll be able to automate the conversion process, enabling easy integration into your applications or workflows.
Imagine a scenario where you have a repository of PDF documents that need to be converted into a format suitable for web publishing or collaborative editing. Markdown is a lightweight markup language that is widely used for creating formatted text using a plain-text editor. By converting PDFs into Markdown, you can easily integrate these documents into web pages or collaborative platforms like GitHub, making them accessible and editable by a broader audience.
PDF to Markdown with JavaScript Code Example
// This request demonstrates how to generate markdown from a PDF document.
var axios = require("axios");
var FormData = require("form-data");
var fs = require("fs");

// Create a new form data instance and append the PDF file and parameters to it
var data = new FormData();
data.append("file", fs.createReadStream("/path/to/file"));
data.append("page_break_comments", "on");

// define configuration options for axios request
var config = {
  method: "post",
  maxBodyLength: Infinity, // set maximum length of the request body
  url: "https://api.pdfrest.com/markdown",
  headers: {
    "Api-Key": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", // Replace with your API key
    ...data.getHeaders(), // set headers for the request
  },
  data: data, // set the data to be sent with the request
};

// send request and handle response or error
axios(config)
  .then(function (response) {
    console.log(JSON.stringify(response.data));
  })
  .catch(function (error) {
    console.log(error);
  });
Source: GitHub Repository
Breaking Down the Code
The code begins by importing the necessary modules: axios for making HTTP requests, form-data for handling form submissions, and fs for file system operations.
var data = new FormData();
data.append("file", fs.createReadStream("/path/to/file"));
data.append("page_break_comments", "on");
Here, a new FormData instance is created to manage the data being sent to the API. The PDF file is appended using fs.createReadStream(), which reads the file from the specified path. The page_break_comments parameter is set to "on", which means that comments indicating page breaks will be included in the Markdown output.
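Because fs.createReadStream() only reports a missing file once the stream is read, it can help to check the path before building the form. The following is a minimal sketch of such a check; checkInputFile is a hypothetical helper name introduced here for illustration and is not part of pdfRest, axios, or form-data.

```javascript
var fs = require("fs");

// Hypothetical helper (an assumption for illustration): confirm the input
// path exists, is a regular file, and is not empty before streaming it.
function checkInputFile(filePath) {
  var stats;
  try {
    stats = fs.statSync(filePath);
  } catch (err) {
    // statSync throws with a code such as ENOENT when the path is missing
    throw new Error('Cannot read input file "' + filePath + '": ' + err.code);
  }
  if (!stats.isFile()) {
    throw new Error('Input path "' + filePath + '" is not a regular file');
  }
  if (stats.size === 0) {
    throw new Error('Input file "' + filePath + '" is empty');
  }
  return filePath;
}

// Possible usage with the form from the tutorial:
// data.append("file", fs.createReadStream(checkInputFile("/path/to/file")));
```

Failing early this way gives a clearer message than an error raised partway through the upload.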
var config = {
  method: "post",
  maxBodyLength: Infinity,
  url: "https://api.pdfrest.com/markdown",
  headers: {
    "Api-Key": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    ...data.getHeaders(),
  },
  data: data,
};
The configuration for the axios request is defined here. The request method is set to "post", and the URL is the endpoint for the PDF to Markdown API. The maxBodyLength is set to Infinity to accommodate large files. The headers include the API key, which must be replaced with your actual key, and headers from the FormData instance.
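Since the API key has to be replaced before the request works, one option is to read it from the environment instead of editing the source. The sketch below assumes an environment variable named PDFREST_API_KEY; that name, along with the helpers getApiKey and buildConfig, is hypothetical and chosen only for this example. The config fields themselves are the ones used in the tutorial code.

```javascript
// Hypothetical helper: read the key from an environment variable.
// The name PDFREST_API_KEY is an assumption, not a pdfRest convention.
function getApiKey(env) {
  var key = (env || process.env).PDFREST_API_KEY;
  if (!key) {
    throw new Error("Set the PDFREST_API_KEY environment variable first");
  }
  return key;
}

// Hypothetical helper: build the same axios config as in the tutorial.
// `form` is expected to be a form-data instance; any object that exposes
// a getHeaders() method will work here.
function buildConfig(form, apiKey) {
  return {
    method: "post",
    maxBodyLength: Infinity, // allow large request bodies
    url: "https://api.pdfrest.com/markdown",
    headers: {
      "Api-Key": apiKey,
      ...form.getHeaders(), // multipart content type with boundary
    },
    data: form,
  };
}

// Possible usage with the form from the tutorial:
// var config = buildConfig(data, getApiKey());
```

Keeping the key out of the source file also makes it harder to commit it to a repository by accident.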
axios(config)
  .then(function (response) {
    console.log(JSON.stringify(response.data));
  })
  .catch(function (error) {
    console.log(error);
  });
The axios request is executed, and the response is handled. If successful, the response data is logged to the console. If an error occurs, it is caught and logged.
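Logging the whole error object can be noisy. As a rough sketch, the catch handler could separate the cases that axios exposes on its error objects: a response property when the server replied with a non-2xx status, and a request property when no reply arrived. The helper name describeRequestError is hypothetical, and the exact shape of the error body returned by the API is not assumed here; it is simply serialized as-is.

```javascript
// Hypothetical helper: turn an axios error into a short, readable message.
// It relies on the `response` and `request` properties that axios sets
// on the errors it rejects with.
function describeRequestError(error) {
  if (error.response) {
    // The server answered with a status outside the 2xx range.
    var body = JSON.stringify(error.response.data);
    return "API responded with status " + error.response.status + ": " + body;
  }
  if (error.request) {
    // The request was sent but no response was received.
    return "No response received: " + error.message;
  }
  // Something failed before the request could be sent.
  return "Request could not be sent: " + error.message;
}

// Possible usage with the request from the tutorial:
// axios(config)
//   .then(function (response) { console.log(JSON.stringify(response.data)); })
//   .catch(function (error) { console.error(describeRequestError(error)); });
```

A rejected API key, for example, would fall under the first branch, while a network failure would typically land in the second.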
Beyond the Tutorial
In this tutorial, you learned how to convert a PDF document to Markdown using JavaScript and the pdfRest API. This process allows you to automate and integrate document conversions into your applications. To explore more, try out all the pdfRest API Tools in the API Lab. For detailed information on each API, refer to the API Reference Guide.
Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at GitHub Repository.