Tray IDP

Tray Intelligent Document Processing (IDP) extracts structured data from unstructured documents including PDFs, images (JPEG, PNG), and multi-page files (TIFF). IDP integrates extracted data directly into your business systems through Tray's 600+ connectors, reducing manual data entry and accelerating workflows such as invoice processing, contract analysis, claims management, and employee onboarding.

Technical Documentation

For detailed connector operations and API reference, see the Merlin IDP Connector documentation.

When to Use Tray IDP

Use Tray IDP to automate document-heavy processes where:

Documents arrive in multiple formats - PDFs, images, scanned documents, multi-page files
Data must flow into downstream systems - ERP, CRM, databases, storage systems
Manual data entry creates bottlenecks - Transcription is time-consuming and error-prone
Document volumes justify automation - Processing dozens to thousands of documents per month
Structured data extraction is needed - Invoices, forms, contracts, receipts

Common Use Cases

Finance & Accounting

Invoice processing and AP automation - Extract vendor details, line items, totals, and payment terms from invoices for automated entry into accounting systems
Expense report management - Parse receipts and expense forms to streamline reimbursement workflows
Purchase order reconciliation - Match purchase orders with invoices and delivery documents
Financial statement analysis - Extract key figures from statements and reports for consolidation

Legal & Compliance

Contract data extraction and management - Pull key terms, dates, obligations, and parties from contracts
Regulatory compliance documentation - Extract required data points from compliance filings and reports
Legal discovery support - Process and categorize large volumes of documents for e-discovery

Human Resources

Resume parsing and candidate screening - Extract education, experience, skills, and contact information from resumes
Employee onboarding documentation - Process ID documents, tax forms, and employment agreements
Benefits administration - Extract data from benefits enrollment forms and insurance documents

Healthcare

Patient intake forms - Digitize patient information from registration and medical history forms
Insurance claims processing - Extract diagnosis codes, treatment details, and billing information
Medical records digitization - Convert paper records into structured, searchable data

Supply Chain & Logistics

Bill of lading extraction - Pull shipment details, weights, and routing information
Customs documentation - Extract required fields from customs forms and declarations
Shipping manifests - Process cargo lists and tracking information

Insurance

Policy application processing - Extract applicant information and coverage details
Claims adjudication - Parse claims forms, supporting documentation, and medical records
Underwriting document analysis - Extract risk factors and financial information from applications

How It Works

Tray IDP uses natural language queries to extract specific information from documents. The extraction process follows these steps:

Document Input - Connect to document sources (email attachments, cloud storage, API uploads, webhooks)
Define Queries - Specify what data to extract using plain English questions
Process Documents - IDP analyzes document structure, layout, and content to locate requested information
Integrate Data - Map extracted data to destination systems via any of Tray's 600+ connectors

Example: Invoice Processing

Here's how IDP processes an invoice and sends data to NetSuite:

Query examples for invoice extraction:

"What is the invoice number?"
"What is the vendor name?"
"What is the invoice date?"
"What is the invoice total?"
"What is the due date?"
"Extract all line items with description, quantity, unit price, and total"

Extracted data structure:

{
  "invoice_number": "INV-2024-001",
  "vendor_name": "Acme Supplies Inc.",
  "invoice_date": "2024-03-15",
  "invoice_total": "$4,850.00",
  "due_date": "2024-04-15",
  "line_items": [
    {
      "description": "Office Supplies - Premium Pack",
      "quantity": 10,
      "unit_price": "$250.00",
      "total": "$2,500.00"
    },
    {
      "description": "Printer Toner Cartridges",
      "quantity": 5,
      "unit_price": "$470.00",
      "total": "$2,350.00"
    }
  ]
}

This structured data then flows directly into NetSuite for bill creation, eliminating manual data entry.
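
The amounts in the example above are formatted strings (for instance "$4,850.00"), while many destination systems expect numeric and date types. The following is a minimal Python sketch, not part of the connector, of converting such a result before mapping it. It assumes the output has exactly the shape shown in this example and uses US-style currency formatting; `parse_amount` and `normalize_invoice` are illustrative names.

```python
from datetime import date
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert a string such as "$4,850.00" to a Decimal (assumes US-style formatting)."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def normalize_invoice(raw: dict) -> dict:
    """Return a copy of the extracted invoice with typed dates and amounts."""
    return {
        "invoice_number": raw["invoice_number"],
        "vendor_name": raw["vendor_name"],
        "invoice_date": date.fromisoformat(raw["invoice_date"]),
        "due_date": date.fromisoformat(raw["due_date"]),
        "invoice_total": parse_amount(raw["invoice_total"]),
        "line_items": [
            {
                "description": item["description"],
                "quantity": item["quantity"],
                "unit_price": parse_amount(item["unit_price"]),
                "total": parse_amount(item["total"]),
            }
            for item in raw["line_items"]
        ],
    }
```

Using `Decimal` rather than `float` keeps currency arithmetic exact, which matters when totals are later compared against summed line items.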

Supported Formats and Capabilities

File Format Support

PDF - Single or multi-page documents (up to 20 pages)
JPEG - High-resolution images (minimum 300 DPI recommended)
PNG - Images with transparency support
TIFF - Multi-page scanned documents

Input Methods

File URL - Process documents hosted on web servers or cloud storage (required)
Email attachments - Extract attachment URLs from incoming emails
Cloud storage integrations - Connect to Google Drive, Dropbox, OneDrive, or S3 to access document URLs
API integrations - Receive document URLs from any connected system

Document Types Supported

Invoices and bills
Purchase orders
Receipts and expense reports
Contracts and agreements
Forms and applications
Tax documents
Medical records and claims
Shipping documents
Identity documents
Resumes and CVs

Key Capabilities

Natural language queries - Extract data using plain English questions without complex templates
Multi-field extraction - Extract multiple data points in a single operation
Table and line item extraction - Parse structured tables with multiple rows
Context awareness - Understands document layout, headers, footers, and structure
Multi-page processing - Handle complex documents up to 20 pages
600+ integrations - Direct connection to business systems without custom development

Getting Started

Prerequisites

Before using Tray IDP, ensure you have:

Active Tray.ai account with IDP access enabled (contact your Customer Success representative if not enabled)
Document source configured (email inbox, cloud storage connection, API endpoint)
Destination system connection set up (optional but recommended for automation)

Quick Start Guide

Add the Merlin IDP connector to your Tray workflow from the connector library
Configure document input with the following parameters:
- File name - Name of the document for tracking and logging
- File URL - Web address where the document is accessible (required)
- MIME type - Specify the file format:
  - application/pdf for PDF files
  - image/jpeg for JPEG images
  - image/png for PNG images
  - image/tiff for TIFF files
- File expire - URL expiration timestamp for hosted files
Define extraction queries as a list of natural language questions
- Be specific about what information you need
- Use terminology that appears in the documents
- For tables, specify all columns you want to extract
Map extracted data to your destination system using Tray's data mapping tools
Add validation and error handling to ensure data quality and handle exceptions

Sample Workflow Templates

Ready-to-use templates for common IDP scenarios:

Invoice to NetSuite - Automatically process invoices and create bills in NetSuite
Contract analysis to Salesforce - Extract contract terms and update Salesforce opportunities
Resume parsing to ATS - Parse resumes and create candidate records in applicant tracking systems
Expense report to QuickBooks - Process employee receipts and create expense entries

Full Documentation

For step-by-step connector configuration, see the Merlin IDP Connector operations guide.

Best Practices

Document Quality Recommendations

For optimal extraction accuracy:

Use high-resolution images (minimum 300 DPI, 600 DPI recommended for scanned documents)
Ensure text is legible and not obscured by stamps, signatures, or annotations
Avoid skewed or rotated pages - straighten scanned documents before processing
Remove unnecessary pages - blank pages or cover sheets that don't contain data
Use color or grayscale scans rather than pure black-and-white (1-bit) scans for better text recognition

Query Design Tips

Writing effective extraction queries:

Be specific in your questions
- Good: "What is the total amount due?"
- Avoid: "What is the amount?"
Use terminology that matches the document
- If the document says "Invoice Number", ask "What is the invoice number?"
- Adapt queries to different document formats or regional variations
For tables, specify all required columns
- Example: "Extract all line items with item description, quantity, unit price, and line total"
Request dates in a consistent format
- Example: "What is the invoice date in YYYY-MM-DD format?"
Ask for full contact information
- Example: "What is the vendor's complete name, address, and tax ID?"
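
When the same kinds of query are reused across workflows, it can help to generate them from a list of field names so the wording stays consistent. A minimal Python sketch of that idea follows; `field_query`, `date_query`, and `line_item_query` are hypothetical helpers written for this illustration, and the phrasing simply mirrors the examples above.

```python
def field_query(label: str) -> str:
    """Ask for a single field, using the label as it appears in the document."""
    return f"What is the {label}?"


def date_query(label: str) -> str:
    """Ask for a date and request a consistent output format."""
    return f"What is the {label} in YYYY-MM-DD format?"


def line_item_query(columns: list[str]) -> str:
    """Build one table query that names every column to extract."""
    if not columns:
        raise ValueError("at least one column is required")
    if len(columns) <= 2:
        listed = " and ".join(columns)
    else:
        listed = ", ".join(columns[:-1]) + ", and " + columns[-1]
    return f"Extract all line items with {listed}"


queries = [
    field_query("total amount due"),
    date_query("invoice date"),
    line_item_query(["item description", "quantity", "unit price", "line total"]),
]
```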

Workflow Design Patterns

Building robust IDP workflows:

Error Handling

Add conditional logic to check for missing or unclear data
Set up fallback paths for documents that fail processing
Implement retry logic with exponential backoff for transient failures
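
Retry with exponential backoff can be sketched in a few lines of Python. This is a generic illustration rather than Tray's built-in behavior: `TransientError` is a hypothetical exception standing in for whatever your workflow treats as retryable (timeouts, rate-limit responses), and the attempt count and base delay are placeholder values.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on TransientError wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the fallback path handle the document
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter so parallel retries do not fire in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the defaults, the waits are roughly 1, 2, and 4 seconds before the final attempt. Passing `sleep` as a parameter keeps the helper easy to test without real delays.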

Validation Checks

Cross-check extracted values against expected formats (dates, amounts, IDs)
Validate totals by summing line items and comparing to extracted total
Flag mismatches for human review
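
As an illustration of these checks, the Python sketch below validates date formats and compares the summed line items with the extracted total. It assumes an extraction result shaped like the invoice example earlier on this page, with "$1,234.56"-style amounts; `to_amount` and `validate_invoice` are illustrative names, not connector functions.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def to_amount(text):
    """Parse "$2,500.00"-style strings; return None if the value is not a valid amount."""
    try:
        return Decimal(str(text).replace("$", "").replace(",", "").strip())
    except InvalidOperation:
        return None


def validate_invoice(invoice: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for field in ("invoice_date", "due_date"):
        try:
            date.fromisoformat(invoice.get(field) or "")
        except (TypeError, ValueError):
            problems.append(f"{field} is not a YYYY-MM-DD date")
    total = to_amount(invoice.get("invoice_total"))
    line_totals = [to_amount(item.get("total")) for item in invoice.get("line_items", [])]
    if total is None or None in line_totals:
        problems.append("one or more amounts could not be parsed")
    elif sum(line_totals) != total:
        problems.append(f"line items sum to {sum(line_totals)}, but invoice_total is {total}")
    return problems
```

A non-empty result is the signal to flag the document for human review rather than write it to the destination system.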

Human Review Workflows

Route documents to review queues when:
- Required fields are missing
- Extracted values seem unreasonable (negative amounts, future dates)
- Document quality is poor
Use approval connectors (Slack, Microsoft Teams, email) for review requests
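
The routing rules above can be expressed as a small function that returns the reasons a document needs review. This Python sketch assumes amounts and dates have already been converted to `Decimal` and `date` values; the `REQUIRED_FIELDS` tuple and the function name are illustrative. Document quality is not covered here because it cannot be judged from the extracted values alone.

```python
from datetime import date
from decimal import Decimal

# Illustrative choice of required fields; adjust to your own document type.
REQUIRED_FIELDS = ("invoice_number", "vendor_name", "invoice_date", "invoice_total")


def review_reasons(invoice: dict, today: date) -> list[str]:
    """Return the reasons to route a document to a review queue (empty list: no review)."""
    reasons = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if invoice.get(name) in (None, "")
    ]
    total = invoice.get("invoice_total")
    if isinstance(total, Decimal) and total < 0:
        reasons.append("negative invoice total")
    invoice_date = invoice.get("invoice_date")
    if isinstance(invoice_date, date) and invoice_date > today:
        reasons.append("invoice date is in the future")
    return reasons
```

The returned reasons can be included in the Slack, Teams, or email review request so the reviewer knows what to check.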

Batch Processing

Process multiple documents in loops for efficiency
Add delays between documents to respect rate limits
Group similar documents together for consistent processing
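
A loop with a pause between documents might look like the following Python sketch. The `extract` argument is a hypothetical callable standing in for the IDP step of your workflow, and the 2-second default is a placeholder, since this page does not state a specific rate limit.

```python
import time


def process_batch(document_urls, extract, delay_seconds=2.0, sleep=time.sleep):
    """Call extract(url) for each document, pausing between calls; collect results and failures."""
    results, failures = {}, {}
    for index, url in enumerate(document_urls):
        if index > 0:
            sleep(delay_seconds)  # pause between documents, not before the first
        try:
            results[url] = extract(url)
        except Exception as error:  # keep going so one bad document does not stop the batch
            failures[url] = str(error)
    return results, failures
```

Collecting failures separately lets the batch finish and leaves the failed documents available for a fallback or review path.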

Logging and Monitoring

Log all extraction attempts with document IDs
Track processing times and success rates
Monitor for patterns in failures to improve queries
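
If extraction attempts are logged as records, success rates and recurring failure reasons can be summarized with a few lines of Python. The `Attempt` record and its fields are assumptions made for this sketch, not a log format produced by Tray.

```python
from collections import Counter
from typing import NamedTuple


class Attempt(NamedTuple):
    """One logged extraction attempt (illustrative fields)."""
    document_id: str
    succeeded: bool
    seconds: float
    failure_reason: str = ""


def summarize(attempts: list[Attempt]) -> dict:
    """Compute success rate, mean processing time, and the most common failure reasons."""
    if not attempts:
        raise ValueError("no attempts to summarize")
    failures = Counter(a.failure_reason for a in attempts if not a.succeeded)
    return {
        "success_rate": sum(a.succeeded for a in attempts) / len(attempts),
        "mean_seconds": sum(a.seconds for a in attempts) / len(attempts),
        "top_failures": failures.most_common(3),
    }
```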

Integration with Other AI Tools

Enhance IDP workflows by combining with other Tray AI capabilities:

Merlin Guardian - Data masking and PII protection

Mask sensitive information (SSN, credit card numbers, personal data) before storing or sharing
Support compliance with GDPR, HIPAA, and PCI-DSS requirements
Example: Extract patient data from medical forms, then mask PII before sending to analytics systems

Merlin Text Analysis - Sentiment and classification

Analyze sentiment of contract clauses or customer feedback in documents
Classify documents by type, urgency, or department
Example: Extract text from support tickets, analyze sentiment, and route to appropriate teams

Merlin Generate Text - Summarization and content generation

Generate executive summaries of extracted contract terms
Create email notifications with document highlights
Example: Extract invoice details, then generate approval request email with key information

Integration Pattern Example:

Combine multiple AI connectors in sequence for sophisticated document workflows: Document → Merlin IDP (extract) → Merlin Guardian (mask PII) → Merlin Text Analysis (classify) → Route to appropriate system based on classification.
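
In a Tray workflow this sequence is built from connector steps, but the control flow can be sketched in Python to make the order of operations explicit. Every callable below (`extract`, `mask_pii`, `classify`, and the route handlers) is a hypothetical stand-in for the corresponding connector step, not a real API.

```python
def run_pipeline(document_url, extract, mask_pii, classify, routes, default_route):
    """Extract, mask, classify, then hand the masked data to the handler for its class."""
    extracted = extract(document_url)
    masked = mask_pii(extracted)  # mask before anything is classified or sent downstream
    label = classify(masked)
    handler = routes.get(label, default_route)  # unknown classes go to the default route
    return label, handler(masked)
```

Masking before classification means the later steps and destination systems only ever see the masked data.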

Limitations and Considerations

Processing Limits

Document Size Constraints

Pages per execution: 20 pages maximum per document
Monthly quota: 1,000 pages per month (default allocation)
File size: Recommended maximum 10 MB per file (varies by format)
Documents per workflow: No hard limit, but consider rate limits

For higher volumes: Contact your Customer Success representative to discuss increasing limits based on your needs.
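
The limits above lend themselves to a quick pre-flight check. The Python sketch below uses the 20-page and 1,000-page figures from this section. It assumes, as an illustration only, that a longer document would be split into page ranges before submission; this page does not describe how oversized documents are handled.

```python
import math

MAX_PAGES_PER_EXECUTION = 20   # per-document limit listed above
DEFAULT_MONTHLY_PAGES = 1000   # default monthly allocation listed above


def executions_needed(page_count: int) -> int:
    """Number of executions required if a document is split into 20-page parts."""
    return math.ceil(page_count / MAX_PAGES_PER_EXECUTION)


def page_ranges(page_count: int) -> list[tuple[int, int]]:
    """Split a document into 1-based, inclusive page ranges of at most 20 pages."""
    return [
        (start, min(start + MAX_PAGES_PER_EXECUTION - 1, page_count))
        for start in range(1, page_count + 1, MAX_PAGES_PER_EXECUTION)
    ]


def fits_monthly_quota(pages_used: int, page_count: int, quota: int = DEFAULT_MONTHLY_PAGES) -> bool:
    """True if processing page_count more pages stays within the monthly quota."""
    return pages_used + page_count <= quota
```

For example, a 45-page document would need three executions under this assumption, covering pages 1-20, 21-40, and 41-45.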

Document Constraints

Factors That May Reduce Accuracy:

Small text or low resolution - Text smaller than 8pt or images below 200 DPI
Poor scan quality - Faded, blurry, or low-contrast documents
Handwritten text - Limited support for handwriting recognition
Complex multi-column layouts - Newspapers or academic papers with intricate formatting
Heavy redactions or annotations - Stamps, signatures, or highlights that obscure text
Non-standard fonts - Decorative or stylized fonts may reduce accuracy
Mixed languages - Documents with multiple languages on a single page

Processing Time

Typical extraction times by document complexity:

Simple documents (1-2 pages, straightforward layout): 10-20 seconds
Standard documents (3-5 pages, tables and forms): 20-40 seconds
Complex documents (10-20 pages, multiple tables): 30-60 seconds

Processing time varies based on:

Number of pages
Document complexity
Number of queries
System load

For time-sensitive workflows, account for processing time in your automation design and consider parallel processing for batch operations.
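
Since each extraction can take tens of seconds, running several in parallel shortens the wall-clock time of a batch. A minimal Python sketch using a thread pool follows; `extract` is a hypothetical callable representing one IDP request, and the worker count is a placeholder that should be chosen with your rate limits in mind.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_in_parallel(document_urls, extract, max_workers=4):
    """Run extract(url) concurrently and return the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order; an exception in any call is re-raised here.
        return list(pool.map(extract, document_urls))
```

Threads suit this case because the work is dominated by waiting on network responses rather than local computation.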

Considerations

Extraction Accuracy

Factors Affecting Accuracy:

Document quality - Higher quality source documents yield better results
Query specificity - Well-defined questions produce more accurate extractions
Document standardization - Consistent formats from the same source improve reliability
Field complexity - Simple fields (dates, numbers) extract more reliably than complex free-text

Improving Accuracy:

Test with representative sample documents before production deployment
Refine queries based on test results
Standardize document sources when possible
Implement validation checks in workflows

Review and Validation Workflows

For business-critical data, implement human review:

Extract data with IDP
Apply validation rules (format checks, range validation, required fields)
Flag for review when:
- Required fields are missing
- Values fall outside expected ranges
- Document quality is questionable
Route to approval using Slack, Teams, or email connectors
Collect corrections and update destination systems

Data Security and Compliance

Security Measures:

All document processing occurs in SOC 2 Type II compliant environments
Data is encrypted in transit and at rest using industry-standard protocols
Documents are not retained after processing completes
Processing occurs in regional data centers based on your Tray instance location

Compliance Considerations:

GDPR: IDP processes documents within your specified region and does not store personal data
HIPAA: Suitable for processing healthcare documents when combined with proper workflow design and a Business Associate Agreement (BAA)
PCI-DSS: Can process payment-related documents; combine with Merlin Guardian for card number masking

For specific compliance requirements, consult with your Tray Customer Success team.

Testing Best Practices

Before Production Deployment:

Collect test documents - Gather 20-30 representative samples covering variations you expect
Define success criteria - Set accuracy targets (e.g., 95% field accuracy)
Test extraction queries - Iterate on query wording based on results
Validate with real data - Process actual documents in a test environment
Measure performance - Track processing time and success rates
Document edge cases - Identify document types or conditions that require special handling
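
Field accuracy against a target such as 95% can be measured by comparing extracted values with hand-verified values for each test document. The Python sketch below uses exact matching, which is an assumption; you may prefer to normalize values (whitespace, case, number formats) before comparing. The function name is illustrative.

```python
def field_accuracy(expected: list[dict], extracted: list[dict]) -> float:
    """Fraction of expected fields whose extracted value matches exactly."""
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must cover the same documents")
    total = correct = 0
    for want, got in zip(expected, extracted):
        for name, value in want.items():
            total += 1
            correct += got.get(name) == value  # a missing field counts as wrong
    if total == 0:
        raise ValueError("no fields to compare")
    return correct / total


TARGET = 0.95  # example target from the list above
```

A run passes when `field_accuracy(expected, extracted) >= TARGET`. Tracking the score per field, rather than only overall, helps show which queries need rewording.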

Pricing and Access

Tray IDP is available on specific pricing plans. Default allocation is 1,000 pages per month with options to increase based on your processing needs.

To get started with IDP:

Contact your Customer Success representative to enable IDP in your workspace
Request higher processing limits for increased document volumes
Discuss enterprise requirements for custom implementations and dedicated support