Testing Document Extraction

The Basics

Before testing Butler's document extraction, it's important to introduce a key concept in the Butler platform: document extraction models.

Document extraction models are machine learning models that extract key pieces of data from your documents. There are two categories of document extraction models:

Predefined Models
These are pre-trained models that can be used to extract data from common categories of documents, such as Invoices, ID Cards and Receipts.

They have been trained on millions of documents and images to provide very high accuracy right out of the box.

Custom Models
These are custom-trained models that have been tuned to extract the exact information you want out of your documents.

The possibilities with custom models are endless, but some common examples include Mortgage/Insurance Forms, Shipping Documents, Custom Forms and Tables.

By default, your account is configured with a library containing all of the predefined models, ready to go! You can also create a Custom Model by following the Building your custom model guide.

Select your model

The first step to start testing extraction against your documents is to pick which model to use.

First, navigate to the Library page.

From the Library, you can explore the different models and find the one most useful for you! Try searching for your use case via the search bar or selecting one of the tags on the left.

Once you have found the model most relevant to you, click on its card to view more details about that model.

On this page, you'll be able to view more information about the model, including the different fields that can be extracted, as well as supported file types.

Click the Create button to create your first model using the selected model as the base. This newly created model will now appear underneath the Your Models tab.

You should now be on the Model Details page and ready to try out your new model!

📘
Don't see the model in the library you are looking for?
We are constantly adding new models to the library. If you'd like a new model added, just let us know (via email at [email protected] or via our in-product support chat) and we'd be happy to work with you on your use case!

Upload test documents

Once on the Model Details page, the next step is to upload test documents by clicking on the Upload Documents button.

Upload one or two test documents to get started. Once extraction completes, you'll see each test document alongside its extracted results:

On the left, you'll see the document you uploaded, as well as all of the different parts of the document that were extracted.

On the right, you'll see the fields that were extracted (name in bold) and the exact values found. When hovering over any field, you'll be able to see the location on the document where it was found.
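When reviewing results, it can help to think of each extracted field as a small record: a name, the value found, and where on the document it was found. The sketch below models that idea so that empty fields can be flagged for a closer look. It is only an illustration: the `ExtractedField` class, its attribute names, and the sample values are hypothetical and are not Butler's actual result format.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    # All attribute names here are hypothetical, chosen for illustration.
    name: str    # field name, shown in bold in the results panel
    value: str   # exact value found in the document
    page: int    # page on which the value was found (assumed attribute)
    box: tuple   # (left, top, width, height) of the highlighted region


def missing_fields(fields):
    """Return the names of fields whose extracted value is empty."""
    return [field.name for field in fields if not field.value.strip()]


# Made-up sample data, not real extraction output.
fields = [
    ExtractedField("Invoice Number", "INV-001", 0, (0.62, 0.08, 0.20, 0.03)),
    ExtractedField("Due Date", "", 0, (0.0, 0.0, 0.0, 0.0)),
]
print(missing_fields(fields))  # prints ['Due Date']
```

A check like this is a quick way to spot which fields deserve a second look across several test documents.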

Feel free to upload up to 50 different test documents to your model. If you close the results dialog, you can always click on a document to view it again, and you can delete any test documents you no longer need.
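If you are gathering test files from a local folder before uploading, a small script can help you stay within the 50-document limit. The following is a minimal sketch and not part of Butler's tooling: the function name `pick_test_documents` and the extension list are assumptions made for illustration, so check the supported file types shown on your model's page.

```python
from pathlib import Path

# Limit stated in this guide for test documents per model.
MAX_TEST_DOCUMENTS = 50

# Assumed extensions, for illustration only; the model's page lists
# the file types it actually supports.
ASSUMED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def pick_test_documents(folder, extensions=ASSUMED_EXTENSIONS, limit=MAX_TEST_DOCUMENTS):
    """Return up to `limit` files in `folder` whose extension is in `extensions`."""
    candidates = sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )
    return candidates[:limit]
```

Calling `pick_test_documents("samples")` on a hypothetical `samples` folder returns a sorted list of at most 50 paths that you can then upload through the Upload Documents button.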

Next steps

At this point, you've learned how to test extraction against any of your documents! You're now ready to try building document processing into your product or workflow by following the Using the REST API guide.
