Extract from File or URL

The Extract from File or URL endpoint runs extraction on a document you send in the same request, either as an uploaded file or via a URL. No document is stored in ParDocs; you get OCR, markdown, and extracted results back immediately. Use this when you want to process a document once without creating a document record.

Query Parameters:

include_coordinates (boolean, optional): When true, each extracted field in result includes the source location: value, page, and bounding_box_coordinates (normalized [x1, y1, x2, y2]). Default is false.
force_exclusive_template (boolean, optional): When true, the first template’s document type is used for the entire document instead of auto-splitting by type. Default is false.
document_layout_analysis (boolean, optional): When true, layout analysis is run before reading (useful for complex or table-heavy documents). Default is false.

Request Body (multipart/form-data):

file (binary, optional): The document file to extract from. Provide either file or url, not both.
url (string, optional): The URL of the document to extract from. Provide either file or url, not both.
template_ids (array of strings, required): List of template_ids used for splitting and extraction.

curl --request POST \
  --url 'https://api.pardocs.com/v1/documents/extract?include_coordinates=true' \
  --header 'x-api-key: <api-key>' \
  --form 'file=@/path/to/document.pdf' \
  --form 'template_ids=["gAWADq4aB"]'
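The same request can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_extract_request and its return shape are hypothetical. It assumes template_ids is sent as a JSON-encoded array and booleans as lowercase true/false, mirroring the curl example above; other encodings may or may not be accepted. It only builds and validates the request parts, so the actual upload is left to whichever HTTP client you use.

```python
import json

# Endpoint taken from the curl example above.
EXTRACT_URL = "https://api.pardocs.com/v1/documents/extract"


def build_extract_request(template_ids, file_path=None, url=None,
                          include_coordinates=False,
                          force_exclusive_template=False,
                          document_layout_analysis=False):
    """Build the query string values and form fields for an extract call.

    Hypothetical helper: it validates the inputs and returns plain dicts.
    """
    if not template_ids:
        raise ValueError("template_ids is required")
    # Exactly one document source is allowed: file or url, not both.
    if (file_path is None) == (url is None):
        raise ValueError("provide exactly one of file_path or url")

    query = {
        # Assumption: booleans are sent as lowercase strings, as in the curl example.
        "include_coordinates": str(include_coordinates).lower(),
        "force_exclusive_template": str(force_exclusive_template).lower(),
        "document_layout_analysis": str(document_layout_analysis).lower(),
    }
    # Assumption: template_ids is a JSON array in a single form field.
    form = {"template_ids": json.dumps(list(template_ids))}
    if url is not None:
        form["url"] = url

    return {"endpoint": EXTRACT_URL, "query": query, "form": form,
            "file_path": file_path}


request_parts = build_extract_request(
    ["gAWADq4aB"], file_path="/path/to/document.pdf", include_coordinates=True
)
print(request_parts["query"], request_parts["form"])
```

Keeping validation separate from sending means the file-or-url rule is checked before any network call is made.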

Response:

The response is a single object with:

ocr: Array of OCR output per page.
markdown: Full document text as markdown.
result: Array of split sections. Each item has document_type, pages, and properties (the extracted key–value pairs for that section).
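To make the shape of result concrete, here is a small Python sketch that walks the split sections of a parsed response. The sample payload is invented for illustration: the document type, field names, and the representation of pages as a list of integers are assumptions, not documented values.

```python
def summarize_sections(response):
    """Return one (document_type, pages, field_names) tuple per split section."""
    summary = []
    for section in response.get("result", []):
        summary.append((
            section["document_type"],
            section["pages"],
            sorted(section["properties"]),  # names of the extracted fields
        ))
    return summary


# Hypothetical response body, already decoded from JSON.
sample_response = {
    "ocr": [],
    "markdown": "# Invoice",
    "result": [
        {
            "document_type": "invoice",
            "pages": [1, 2],  # assumption: pages is a list of page numbers
            "properties": {"invoice_number": "INV-001", "total": 120.5},
        }
    ],
}

for document_type, pages, fields in summarize_sections(sample_response):
    print(document_type, pages, fields)
```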

When include_coordinates=true, each value in properties is an object with value, page, and bounding_box_coordinates (normalized [x1, y1, x2, y2]) instead of a plain string or number, so you can map each field back to a region on the document.
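Because a property may be either a plain value or a coordinate object, client code usually needs to handle both shapes. The Python sketch below shows one way to do that and to scale a normalized box to pixels. The helper names are hypothetical, and the pixel conversion assumes the normalized coordinates share the origin and axis direction of the rendered page image, which the description above does not specify.

```python
def field_value(prop):
    """Return the plain value whether or not coordinates were requested."""
    if isinstance(prop, dict) and "bounding_box_coordinates" in prop:
        return prop["value"]
    return prop


def to_pixels(bbox, page_width, page_height):
    """Scale a normalized [x1, y1, x2, y2] box to integer pixel coordinates.

    Assumption: normalized values use the same origin and axis direction
    as the page image being drawn on.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * page_width),
        round(y1 * page_height),
        round(x2 * page_width),
        round(y2 * page_height),
    )


# Hypothetical property as returned with include_coordinates=true.
total = {
    "value": 120.5,
    "page": 1,
    "bounding_box_coordinates": [0.1, 0.2, 0.5, 0.25],
}

print(field_value(total))
print(to_pixels(total["bounding_box_coordinates"], 1000, 2000))
```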

Authorizations

x-api-key

string

header

required

Query Parameters

include_coordinates

boolean

default:false

When true, each extracted field in result includes value, page, and normalized bounding_box_coordinates [x1, y1, x2, y2] (0–1) for the source region in the document.

force_exclusive_template

boolean

default:false

When true, force the first template's document_type for the whole document instead of auto-splitting by type.

document_layout_analysis

boolean

default:false

When true, run layout analysis before reading (e.g. for complex or table-heavy documents).

Body

multipart/form-data

template_ids

string[]

required

List of template_id values to use for splitting and extraction.

file

Document file to extract from. Provide either file or url, not both.

url

string<uri>

URL of the document to extract from. Provide either file or url, not both.

Response

Successful Response. Returns ocr, markdown, and result (array of { document_type, pages, properties }). When include_coordinates=true, each property value is an object with value, page, and bounding_box_coordinates.

ocr

array

OCR output per page from the document.

markdown

string

Converted markdown text of the full document.

result

object[]
