Extract Text From File Uploads


Updated June 2, 2026

Who can use this feature

Related subproduct Integrations, Workflows

Available on All plans

Extract plain text from files uploaded to File Upload form fields — PDFs, Word documents, spreadsheets, images, and scanned documents. The extracted text becomes a variable you can pass directly into AI Tasks for processing, summarization, or data extraction. Uploaded files with extracted text are also searchable in the global search box.

Users: Only Administrators and Builders with ‘edit’ permission can enable text extraction on form fields.

Enable text extraction

Open the workflow in the editor and add (or select) a File Upload form field.
Click the meatball menu (three dots) next to the File Upload form field and select Settings.
Switch the Extract text toggle to the “on” position.
Click Apply.

Screenshot: the Extract text toggle switched on in the File Upload field settings.

When a file is uploaded during a workflow run, the system automatically extracts up to 512 KB of plain text from it.
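To make the 512 KB cap concrete, here is a minimal Python sketch of how a text limit like this could be applied. It is an illustration only, not the product's implementation: the function name, the UTF-8 encoding, and the reading of 1 KB as 1024 bytes are all assumptions.

```python
# Assumption: 1 KB = 1024 bytes. The article only states "512KB".
MAX_TEXT_BYTES = 512 * 1024


def truncate_extracted_text(text: str, limit: int = MAX_TEXT_BYTES) -> str:
    """Keep at most `limit` bytes of UTF-8 text without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    # errors="ignore" drops a trailing partial multi-byte character, if any.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

The practical takeaway is that anything past the limit never reaches the variable, so content that appears late in a very long document may be missing from downstream steps.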

OCR for images and scanned PDFs

OCR (optical character recognition) extracts text from images and scanned documents. No additional setup is required: OCR runs automatically whenever text extraction is enabled.

OCR works with:

Images — PNG, JPG, and other image formats uploaded to a File Upload form field.
Image-only PDFs — Scanned documents that contain images but no selectable text.

Note: If a PDF contains both text and images, standard text extraction is used instead of OCR. This avoids unwanted noise from attempting to recognize characters in images alongside existing text.
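The routing rule described above can be sketched as a small decision function. This is a hypothetical model of the documented behavior, not the actual implementation: the function name, the `has_selectable_text` flag, and the exact extension list are assumptions (the article names only PNG and JPG).

```python
from enum import Enum


class ExtractionMode(Enum):
    STANDARD = "standard"
    OCR = "ocr"


# Assumption: the article names PNG and JPG; ".jpeg" is added for illustration.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def choose_extraction_mode(filename: str, has_selectable_text: bool) -> ExtractionMode:
    """Pick OCR for image files and image-only PDFs, standard extraction otherwise."""
    name = filename.lower()
    if name.endswith(IMAGE_EXTENSIONS):
        return ExtractionMode.OCR
    if name.endswith(".pdf") and not has_selectable_text:
        return ExtractionMode.OCR
    # A PDF with any selectable text uses standard extraction, even if it has images.
    return ExtractionMode.STANDARD
```

One consequence worth noting: text that appears only inside images in a mixed PDF is not captured, because OCR is skipped for that file.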

OCR-extracted text works the same way as standard extracted text — it populates the {{form.Field_name.text_content}} variable and makes uploaded files searchable in the global search box.

Note: OCR currently supports English only. Contact support to request additional language support.

Use extracted text as a variable

Once text extraction is enabled, the extracted content is available as a variable anywhere you see the magic wand icon in the workflow editor.

The variable follows this pattern:

{{form.Field_name.text_content}}

For example, if your File Upload field is called “Employee Resume”, the variable is {{form.Employee_resume.text_content}}.
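The naming pattern can be sketched in Python as follows. The rule used here (join words with underscores, capitalize the first letter, lowercase the rest) is inferred from the single example above and is an assumption; the helper name is hypothetical. Field names with punctuation or other special characters may be handled differently, so confirm the exact variable in the magic wand picker.

```python
def text_content_variable(field_label: str) -> str:
    """Build the text_content variable for a File Upload field label.

    Assumed rule, inferred from one example: whitespace becomes underscores,
    the first character is uppercased, and the rest is lowercased.
    """
    key = "_".join(field_label.split()).capitalize()
    return "{{form." + key + ".text_content}}"


print(text_content_variable("Employee Resume"))
```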

Search uploaded file content

Files with extracted text — including OCR-processed images and scanned PDFs — are indexed and searchable in the global search box. Type a keyword from the file’s content into Search to find matching uploads.

This is especially useful for locating scanned documents or images by the text they contain, without needing to open each file individually.

Use extracted text in AI Tasks

Pass the extracted text variable into an AI Task to process, analyze, or transform the content. Common use cases include:

Extracting structured fields — Pull specific data points (names, dates, amounts) from completed forms or contracts.
Parsing tabular data — Extract rows and columns from uploaded spreadsheets or PDF tables.
Summarizing documents — Generate concise summaries of lengthy reports or policy documents.
Analyzing unstructured text — Identify themes, sentiment, or key takeaways from notes and feedback.

To set this up, add an AI Task after the File Upload task in your workflow. In the AI Task’s prompt, insert the {{form.Field_name.text_content}} variable and describe what you want the AI to do with the text.
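The sketch below simulates, in Python, what happens to a prompt that contains the variable: the placeholder is replaced with the extracted text before the prompt reaches the AI. The workflow editor performs this substitution for you; the `render_prompt` helper, the sample prompt, and the sample resume text are all hypothetical, and treating unknown variables as empty strings is an assumption.

```python
import re

# Matches {{name}} placeholders made of letters, digits, underscores, and dots.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_prompt(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} with its value; unknown names become empty strings."""
    return PLACEHOLDER.sub(lambda m: variables.get(m.group(1), ""), template)


prompt = (
    "Summarize the resume below in three bullet points.\n\n"
    "{{form.Employee_resume.text_content}}"
)
rendered = render_prompt(
    prompt,
    {"form.Employee_resume.text_content": "Jane Doe. Ten years of accounting experience."},
)
print(rendered)
```

Putting the instruction first and the variable last keeps the prompt readable even when the extracted text is long.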


FAQ

What file types are supported?

Text extraction supports PDFs, Word documents, and spreadsheet and text files such as CSV, TSV, ODS, and TXT. Image files (PNG, JPG) are also supported: OCR extracts text from images and image-only PDFs automatically.

Is there a file size limit?

The uploaded file can be any size within the standard file upload limits, but only the first 512 KB of extracted text is available, which is roughly 100,000 words.
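As a rough check on the word estimate, the arithmetic can be written out in Python. It assumes 1 KB = 1024 bytes and single-byte characters such as plain English text; both are assumptions, and text in scripts that need several bytes per character would fit fewer words.

```python
LIMIT_BYTES = 512 * 1024   # 524,288 bytes, assuming 1 KB = 1024 bytes
APPROX_WORDS = 100_000     # the figure quoted in the article

# Average bytes available per word, including the space that separates words.
bytes_per_word = LIMIT_BYTES / APPROX_WORDS
print(f"{bytes_per_word:.2f} bytes per word")
```

So the estimate corresponds to an average of a little over five bytes per word, separator included. Documents with longer average words will hit the limit at a lower word count.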

What happens if the file contains no extractable text?

If the file has no recognizable text content even after OCR processing, the variable is empty.

What languages does OCR support?

OCR currently supports English. Contact support if you need additional language support.

Does OCR work on PDFs that contain both text and images?

No. If a PDF contains selectable text alongside images, standard text extraction is used. OCR only activates for image-only PDFs and image files. This prevents character-recognition noise from interfering with the existing text.

Learn more about File Upload form fields and AI Tasks.