POST /v1/documents/{document_id}/extraction
curl --request POST \
  --url 'https://api.pardocs.com/v1/documents/{document_id}/extraction?include_coordinates=true' \
  --header 'x-api-key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "template_ids": [
    "<string>"
  ]
}'
[
  {
    "document_type": "<string>",
    "pages": [
      123
    ],
    "extracted": {}
  }
]

The Split and Extract a Document endpoint splits a document into sections by document type and runs extraction on each section. Use it for complex documents whose parts must be extracted separately, each according to its own template schema.
Query Parameters:
  • include_coordinates (boolean, optional): When true, the response includes the source location for each extracted value (page number and bounding box). Use this when you need to highlight or reference the exact region in the document where a value was found. Default is false.
Request Body Parameters:
  • template_ids (array of strings): A list of template_ids that dictate how the document should be split and extracted. Each ID corresponds to a specific template schema that defines the rules for processing each section of the document.
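The request described above can also be assembled in code. The following is a minimal sketch using only the Python standard library; the helper name `build_extraction_request` is hypothetical, the base URL is copied from the curl example, and the `Content-Type: application/json` header is an assumption based on the documented body type.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL as it appears in the curl example.
BASE_URL = "https://api.pardocs.com/v1"


def build_extraction_request(document_id, template_ids, api_key,
                             include_coordinates=False):
    """Build (but do not send) a POST request for the extraction endpoint.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    # Booleans are sent as lowercase strings in the query string.
    query = urlencode(
        {"include_coordinates": "true" if include_coordinates else "false"}
    )
    # Percent-encode the document ID so it is safe inside the URL path.
    url = f"{BASE_URL}/documents/{quote(document_id, safe='')}/extraction?{query}"
    body = json.dumps({"template_ids": list(template_ids)}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            # Assumption: the API expects an explicit JSON content type.
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the response body with `json.load`.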
Response when include_coordinates=true
When you pass include_coordinates=true, each extracted field in extracted is an object with the following shape instead of a plain string or number:
  • value: The extracted text or number (same as the non-coordinate response).
  • page: The 1-based page number where this value was found.
  • bounding_box_coordinates: Normalized bounding box [x1, y1, x2, y2] in the range 0–1 (relative to page width/height), so you can map the value back to a rectangle on the document for highlighting or overlay.
This lets you show users exactly where each piece of data came from in the original document (e.g., for IR reports or table-heavy PDFs). When include_coordinates is false or omitted, extracted fields remain simple key–value pairs (e.g. "analyst": "최보영").
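Mapping a normalized bounding box back to a rendered page only requires scaling by the page dimensions. The sketch below assumes you already know the pixel width and height of the rendered page; the function name `bbox_to_pixels` is hypothetical. The documentation does not state where the coordinate origin lies, so check that it matches your rendering library before drawing.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Scale a normalized [x1, y1, x2, y2] box (0-1) to pixel coordinates.

    Hypothetical helper; the origin convention is an assumption to verify.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * page_width),   # left edge
        round(y1 * page_height),  # first vertical edge
        round(x2 * page_width),   # right edge
        round(y2 * page_height),  # second vertical edge
    )


# Example: a box covering the middle half of the width and the lower half
# of the height of an 800 x 1000 pixel page.
rect = bbox_to_pixels([0.25, 0.5, 0.75, 1.0], 800, 1000)
```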

Authorizations

x-api-key (string, header, required)

Path Parameters

document_id (string, required): The ID of the document to extract.

Query Parameters

include_coordinates (boolean, default: false): When true, each extracted field in the response includes the source location: value, page number, and normalized bounding_box_coordinates [x1, y1, x2, y2] (0–1) so you can highlight or reference the exact region in the document.

Body

application/json

template_ids (string[]): The IDs of all templates to use when splitting and extracting the document.

Response

Successful Response. When include_coordinates=true, each value in extracted is an object with value, page, and bounding_box_coordinates (normalized [x1,y1,x2,y2]); otherwise values are plain strings/numbers.

document_type (string)
pages (integer[])
extracted (object)
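Because the shape of extracted depends on include_coordinates, client code may want a single view of the plain values. The sketch below is one way to do that, assuming a coordinate-bearing field is always an object containing exactly the documented keys value, page, and bounding_box_coordinates; the function name `plain_values` and the sample field names are hypothetical.

```python
COORDINATE_KEYS = {"value", "page", "bounding_box_coordinates"}


def plain_values(extracted):
    """Return {field: value} for either response shape.

    Hypothetical helper: unwraps coordinate objects and passes plain
    strings or numbers through unchanged.
    """
    result = {}
    for name, field in extracted.items():
        if isinstance(field, dict) and COORDINATE_KEYS <= field.keys():
            result[name] = field["value"]
        else:
            result[name] = field
    return result


# Illustrative data only, not real API output.
with_coords = {
    "analyst": {
        "value": "최보영",
        "page": 1,
        "bounding_box_coordinates": [0.1, 0.2, 0.3, 0.25],
    }
}
without_coords = {"analyst": "최보영"}
```

Both sample inputs above reduce to the same dictionary, so downstream code does not need to branch on how the request was made.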