Receipt Processing Pipeline
The receipt pipeline converts uploaded images into structured transaction data using OCR and AI.
Pipeline Overview
Processing Details
OCR (EasyOCR)
- Server-side text extraction using the EasyOCR library
- Configured for English and multilingual (Tagalog/English) text
- Handles common receipt issues: rotated text, low contrast, thermal paper
AI Parsing (GPT-4o Vision)
The parser sends both the raw OCR text and the original image to GPT-4o using vision capabilities. The system prompt includes:
- Philippine receipt format templates (BIR Official Receipts, GCash/Maya screenshots)
- Known merchant patterns
- Expected output structure via function calling
Categorization
The categorizer matches merchant names against known Philippine merchants and assigns categories:
SM Supermarket-> Food & GroceriesJollibee-> Food & DiningMeralco-> Utilities- Falls back to AI-suggested category if no merchant match
Confirmation
When the user confirms parsed data, the server creates:
- A
Transactionrecord with the extracted amount, merchant, date, and category TransactionLineItemrecords for each parsed line item- Links the
Receiptto the newTransaction
Philippine-Specific Handling
- GCash/Maya screenshots: Template-based extraction using known UI layouts
- Sari-sari store receipts: Handwritten text recognition with lower confidence threshold
- BIR Official Receipts: Extract TIN, OR number, VAT breakdown
- Mixed Tagalog/English: OCR model configured for multilingual extraction
Rate Limits
- Maximum upload size: 10MB per image
- Accepted formats: JPEG, PNG, HEIC
- Images are compressed on the mobile client before upload