Overview
This guide shows how to extract data from invoices using Butler's OCR APIs in Python. In about 15 minutes you'll be ready to add Python invoice OCR to your product or workflow!
Before getting started, you'll want to make sure to do the following:
- Sign up for a free Butler account at https://app.butlerlabs.ai
- Write down your Butler API key from the Settings menu. See the Getting Started guide for details on where to find it.
Get your API ID
Sign in to the Butler product, go to the Library, and search for the Invoice model.
Click the Invoice card, then press the Create button to create a new Invoice model.
On the model details page, go to the APIs tab.
Copy the API ID (also known as the Queue ID) and write it down. We'll use it in our code below.
Sample Python Invoice OCR Code
You can copy and paste the following Python sample code to process documents with OCR using the API.
# Install the Butler SDK in your environment first: pip install butler-sdk
from butler import Client
# Get API Key from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#get-your-api-key
api_key = '<api-key>'
# Get Queue ID from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#go-to-the-model-details-page
queue_id = '<queue_id>'
# Response is a strongly typed object
response = Client(api_key).extract_document(queue_id, 'test_invoice.pdf')
# Convert to a dictionary for printing
print(response.to_dict())
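Hard-coding the API key is fine for a first test, but it is easy to leak once the script is shared or committed. The following is a minimal sketch, assuming the key and API ID are stored in environment variables named BUTLER_API_KEY and BUTLER_QUEUE_ID. Those names and both helper functions are hypothetical and chosen for this example; only Client, extract_document, and to_dict come from the sample above.

```python
import os
from typing import Mapping, Tuple


def load_butler_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the API key and queue ID from environment variables.

    BUTLER_API_KEY and BUTLER_QUEUE_ID are hypothetical names used for
    this sketch; they are not defined by Butler.
    """
    names = ("BUTLER_API_KEY", "BUTLER_QUEUE_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["BUTLER_API_KEY"], env["BUTLER_QUEUE_ID"]


def extract_invoice(path: str) -> dict:
    """Upload one file and return the extraction result as a dictionary."""
    # Imported here so the settings helper can be used without the SDK installed.
    from butler import Client

    api_key, queue_id = load_butler_settings()
    response = Client(api_key).extract_document(queue_id, path)
    return response.to_dict()


# Example usage (requires butler-sdk and both environment variables):
# print(extract_invoice("test_invoice.pdf"))
```

Failing early with a clear message when a variable is missing tends to be easier to debug than an authentication error returned by the API.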
In-Product Sample Code
You can also copy the sample code directly from the product. This code will have your API ID and API key pre-populated for you!
Extracted Invoice Fields
Here is an example of what an Invoice JSON response looks like:
{
  "documentId": "014a5a3f-b91e-439e-b301-66abbbfd69b6",
  "documentStatus": "Completed",
  "fileName": "test_invoice.png",
  "mimeType": "image/png",
  "documentType": "Invoice",
  "confidenceScore": "High",
  "formFields": [
    {
      "fieldName": "Invoice Number",
      "value": "#9000000001",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Invoice Date",
      "value": "Dec 11, 2020,",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Customer Name",
      "value": "Veronica Costello",
      "confidenceScore": "Low"
    },
    {
      "fieldName": "Customer Address Recipient",
      "value": "Veronica Costello",
      "confidenceScore": "Low"
    },
    {
      "fieldName": "Customer Address",
      "value": "6146 Honey Bluff Parkway Calder, Michigan, 49628-7978 United States",
      "confidenceScore": "Low"
    },
    {
      "fieldName": "Vendor Name",
      "value": "Stripes Shop",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Vendor Address Recipient",
      "value": "StripesShop",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Vendor Address",
      "value": "6146 Honey Bluff Parkway Calder, Michigan, 49628-7978 United States",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Subtotal",
      "value": "$141.00",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Total Tax",
      "value": "$10.47",
      "confidenceScore": "High"
    },
    {
      "fieldName": "Invoice Total",
      "value": "$162.37",
      "confidenceScore": "High"
    }
  ],
  "tables": [
    {
      "tableName": "Line Items",
      "confidenceScore": "Low",
      "rows": [
        {
          "cells": [
            {
              "columnName": "Description",
              "confidenceScore": "Low",
              "value": "Endurance Watch SKU: 24-MG01"
            },
            {
              "columnName": "Quantity",
              "confidenceScore": "Low",
              "value": "1"
            },
            {
              "columnName": "Amount",
              "confidenceScore": "Low",
              "value": "$49.00"
            }
          ]
        }
      ]
    }
  ]
}
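Once you have the response as a dictionary, you will usually want to look fields up by name and flag anything the model was unsure about. The following is a rough sketch, assuming the dictionary uses the same key names as the JSON above (formFields, tables, rows, cells); the dictionary returned by the SDK's to_dict() may differ, so check it against your own output. The helper names and the trimmed sample are made up for illustration.

```python
from typing import Any, Dict, List


def fields_by_name(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each form field name to its extracted value."""
    return {f["fieldName"]: f["value"] for f in response.get("formFields", [])}


def low_confidence_fields(response: Dict[str, Any], level: str = "Low") -> List[str]:
    """List the names of form fields whose confidence equals `level`."""
    return [
        f["fieldName"]
        for f in response.get("formFields", [])
        if f.get("confidenceScore") == level
    ]


def table_rows(response: Dict[str, Any], table_name: str) -> List[Dict[str, str]]:
    """Return the rows of one table as {column name: value} dictionaries."""
    for table in response.get("tables", []):
        if table.get("tableName") == table_name:
            return [
                {cell["columnName"]: cell["value"] for cell in row.get("cells", [])}
                for row in table.get("rows", [])
            ]
    return []  # No table with that name in the response


# Trimmed copy of the example response above, used only for illustration.
sample = {
    "formFields": [
        {"fieldName": "Invoice Number", "value": "#9000000001", "confidenceScore": "High"},
        {"fieldName": "Customer Name", "value": "Veronica Costello", "confidenceScore": "Low"},
        {"fieldName": "Invoice Total", "value": "$162.37", "confidenceScore": "High"},
    ],
    "tables": [
        {
            "tableName": "Line Items",
            "confidenceScore": "Low",
            "rows": [
                {
                    "cells": [
                        {"columnName": "Description", "confidenceScore": "Low",
                         "value": "Endurance Watch SKU: 24-MG01"},
                        {"columnName": "Quantity", "confidenceScore": "Low", "value": "1"},
                        {"columnName": "Amount", "confidenceScore": "Low", "value": "$49.00"},
                    ]
                }
            ],
        }
    ],
}

invoice = fields_by_name(sample)
needs_review = low_confidence_fields(sample)
line_items = table_rows(sample, "Line Items")
```

Routing the fields in needs_review to a manual check, rather than trusting every value, is one simple way to make use of the confidence scores.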
Full Invoice API Response
The above JSON does not include all of the values that can be extracted from Invoices. For full details, see the Invoice page.