Data Policy

Zero retention.
Zero exposure.

Your documents are processed in memory and never stored. We don't retain uploaded files or extracted data after your API response is delivered.

Effective March 2026 · No legalese, no surprises.

The short version: Upload a PDF, get structured JSON back, done. No document content you send us persists on our servers after your response is delivered. No database rows containing your document content. No log lines containing extracted text. No file on disk. Nothing stored means nothing to breach.

📄

Uploaded PDFs

Never stored

When you upload a file to /api/v1/parse, it lands in a memory buffer. It is never written to disk and never saved to object storage. The buffer is released as soon as your response is sent and is then reclaimed by garbage collection.

This is enforced at the infrastructure level: our file upload handler uses in-memory storage exclusively. There is no path where a PDF ends up on a filesystem or in cloud storage.
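
As an illustration of the in-memory pattern described above, here is a minimal Python sketch. It is not ShapeForge's actual handler: the function name `parse_in_memory` and the returned fields are hypothetical, and a real parser would do far more than check the PDF header. The point is only that the upload is wrapped in an in-memory stream with no filesystem path, and that the stream is closed before the function returns.

```python
import io


def parse_in_memory(upload: bytes) -> dict:
    """Hypothetical helper: inspect an uploaded PDF without touching disk."""
    buffer = io.BytesIO(upload)  # in-memory stream; no file path exists
    try:
        header = buffer.read(5)  # a real parser would consume the whole stream
        if header != b"%PDF-":
            raise ValueError("upload is not a PDF")
        size = buffer.seek(0, io.SEEK_END)  # seek returns the new position, i.e. the length
        return {"is_pdf": True, "size_bytes": size}
    finally:
        buffer.close()  # the stream cannot be read again after this point
```

Calling `parse_in_memory(pdf_bytes)` returns a small dictionary and leaves nothing behind: no temporary file is created, and once the caller drops its own reference to `pdf_bytes` the memory can be reclaimed.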

🔑

Extracted JSON

Not persisted

The structured data we extract from your document — field values, tables, raw text — is returned in your API response and immediately discarded server-side.

We log only anonymized metadata about each request: timestamp, document type, page count, file size in bytes, processing time, a SHA-256 hash of the file (for deduplication), and a hash of your API key (for auth audit). No field values, table content, or raw text from your document are ever written to our database.
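
To make the idea of a metadata-only record concrete, the sketch below builds one in Python. The function name `build_log_record` and the key names are assumptions chosen for illustration, not ShapeForge's actual schema. Only counts, timings, and a SHA-256 digest of the file bytes go into the record; no text from the document is included.

```python
import hashlib
import time


def build_log_record(pdf_bytes: bytes, doc_type: str, page_count: int, started: float) -> dict:
    """Hypothetical helper: describe a request without including document content."""
    return {
        "document_type": doc_type,            # e.g. "invoice" or "receipt"
        "page_count": page_count,
        "file_size_bytes": len(pdf_bytes),
        # elapsed time since `started`, a value taken from time.monotonic()
        "processing_time_ms": round((time.monotonic() - started) * 1000),
        # one-way digest of the raw bytes; identical files give identical digests
        "sha256": hashlib.sha256(pdf_bytes).hexdigest(),
    }
```

A caller would take `started = time.monotonic()` when the request arrives and pass it in once parsing finishes. The digest is a 64-character hex string that identifies a file without revealing what it contains.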

🧪

Demo uploads

Same policy

The interactive demo at /demo uses the same /api/v1/parse endpoint under the hood. Documents you upload in the demo are processed in memory and discarded exactly as they are for API calls.

📋

Application logs

Metadata only

Our server logs record only the minimum needed to operate and debug the service. Document content is explicitly excluded.

Data	Logged?	Contains document content?
Timestamp	Yes	No
Document type (invoice, receipt, etc.)	Yes	No
Page count	Yes	No
File size (bytes)	Yes	No
Processing time (ms)	Yes	No
SHA-256 hash of file	Yes	No (the hash is one-way and cannot be reversed)
API key hash (for auth audit)	Yes	No
Extracted field values	No (blocked)	Yes
Raw document text	No (blocked)	Yes
File contents / buffers	No (blocked)	Yes
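
One common way to enforce a table like the one above is an allowlist applied before anything reaches the logger. The Python sketch below shows the idea. The field names and the function `scrub_for_logging` are hypothetical and are not taken from ShapeForge's code. Anything not named in the allowlist is dropped, so a newly added content field is blocked by default rather than leaked by default.

```python
# Hypothetical field names mirroring the rows that are logged in the table above.
ALLOWED_LOG_FIELDS = frozenset({
    "timestamp",
    "document_type",
    "page_count",
    "file_size_bytes",
    "processing_time_ms",
    "sha256",
    "api_key_hash",
})


def scrub_for_logging(event: dict) -> dict:
    """Keep only allowlisted keys; every other key is silently dropped."""
    return {key: value for key, value in event.items() if key in ALLOWED_LOG_FIELDS}
```

Passing an event that contains both `page_count` and `raw_text` through `scrub_for_logging` yields a dictionary with `page_count` only. An allowlist is the safer design here because a blocklist fails open whenever someone adds a field and forgets to block it.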

Third-party AI processing

ShapeForge uses OpenAI's GPT-4o to extract structured data from document text. Your document text is sent to OpenAI's API for this purpose, and that copy is governed by OpenAI's API data usage policy, under which API inputs are not used for model training by default.

No other third parties receive your document content.

Why zero retention?

Enterprise customers processing invoices, contracts, and financial documents need a simple guarantee: if we don't store it, we can't breach it.

Because we keep no document content at rest, there is no stored copy on our side to raise GDPR data residency questions, to fall under SOC 2 data-at-rest requirements, or to trigger breach notification obligations for extracted data. You parse, you get results, we forget.

Questions?

Email hello@shapeforge.dev. We'll respond within one business day.
