Skip to main content

Receipt Processing Pipeline

The receipt pipeline converts uploaded images into structured transaction data using OCR and AI.

Pipeline Overview

Processing Details

OCR (EasyOCR)

  • Server-side text extraction using the EasyOCR library
  • Configured for English and multilingual (Tagalog/English) text
  • Handles common receipt issues: rotated text, low contrast, thermal paper

AI Parsing (GPT-4o Vision)

The parser sends both the raw OCR text and the original image to GPT-4o using vision capabilities. The system prompt includes:

  • Philippine receipt format templates (BIR Official Receipts, GCash/Maya screenshots)
  • Known merchant patterns
  • Expected output structure via function calling

Categorization

The categorizer matches merchant names against known Philippine merchants and assigns categories:

  • SM Supermarket -> Food & Groceries
  • Jollibee -> Food & Dining
  • Meralco -> Utilities
  • Falls back to AI-suggested category if no merchant match

Confirmation

When the user confirms parsed data, the server creates:

  1. A Transaction record with the extracted amount, merchant, date, and category
  2. TransactionLineItem records for each parsed line item
  3. Links the Receipt to the new Transaction

Philippine-Specific Handling

  • GCash/Maya screenshots: Template-based extraction using known UI layouts
  • Sari-sari store receipts: Handwritten text recognition with lower confidence threshold
  • BIR Official Receipts: Extract TIN, OR number, VAT breakdown
  • Mixed Tagalog/English: OCR model configured for multilingual extraction

Rate Limits

  • Maximum upload size: 10MB per image
  • Accepted formats: JPEG, PNG, HEIC
  • Images are compressed on the mobile client before upload