This article explains how to configure AI-powered processing in your AODocs libraries to automate document processing and reduce manual data entry.
What is metadata extraction?
Manually extracting important information from documents is time-consuming, tedious, and error-prone.
If you set up AI processing in your library, you can extract important information automatically from your documents to populate properties. This saves time for your teams and reduces the risk of errors.
AI for AODocs combines optimized vectorization, OCR technologies, and your Large Language Model (LLM) to read and extract data from your documents, streamlining your document management workflow.
Prerequisites
Before you begin, you must:
- Make sure that AI for AODocs is set up on your domain and activated in your library.
For information on setting up and activating AI for AODocs, contact the AODocs Support team by email at support@aodocs.com or open a ticket.
During this process, the AODocs Support team will allowlist a service account on your domain.
- Make sure you have library administrator permissions in your library.
- Activate the new library administration in your library – learn more: Activate or deactivate early-access features on your domain.
- Add the allowlisted service account as a library administrator in your library. To do this, do one of the following:
- open the library's security settings and add the service account provided by AODocs Support as a library administrator – learn how to do this in Secured Folders and Document Management libraries
- click Continue in the dialog that opens the first time you use AI Processing
Note: Only library administrators can create new AI processing configurations.
Create AI Processing configurations
You can create and configure as many AI processing configurations as you like for a given document class. This enables conditional AI processing based on specific criteria. For example, if processing invoices, you can define different AI configurations according to different supplier names.
1. Open the library administration.
Important: You must activate the new library administration first. Learn more: Activate or deactivate early-access features in libraries.
2. Click AI Processing in the left navigation menu.
3. Click the + Create Configuration button.
4. Enter the following information in the dialog:
- Name: Choose a name for your AI processing configuration
- Document class: Select from the list of document classes defined in your library
In our example, we selected the SOP document class (Standard Operating Procedure).
5. Click Create to begin configuring the AI processing.
Configure AI Processing
To complete your AI processing configuration:
- define when AI processing is triggered
- specify the properties to process
- enhance AI accuracy with ground truth data
- set up optional advanced AI processing features
Define when AI processing is triggered
Under When this happens..., configure the events that will trigger AI processing to start.
1. Click Document Created to see the available options:
- Document created (default option): processing starts when you save a new document.
- Attached file added: processing starts whenever you add a new attached file to an AODocs document.
- Document state changed: processing starts when the document transitions:
- From a specific state to another
- From a specific state (to any other)
- To a specific state (from any other)
Note: AI processing can handle multiple attached files.
2. If required, define additional filters to trigger AI processing based on specific conditions. You can configure filters to check one or more property values before triggering the processing.
In the example below, AI processing is triggered when a document is created with the Buyer Name property set to ACME.
3. Click Update.
In our example, AI processing is triggered:
- when a document in the SOP document class is created
- when the document transitions from "Pending approval" to "Approved"
- when the document transitions from a draft state to any other state
In addition, we only want to process documents where the "Site" property is "Riverside" and the "Document type" property is "Maintenance procedure".
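The trigger and filter logic of this example can be sketched in a few lines of Python. This is only an illustrative model, not the AODocs implementation: the `Trigger` and `Event` classes, the `should_process` function, and the state name "Draft" are hypothetical, and the sketch assumes that any one trigger is enough to start processing while all filters must match.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trigger:
    """One trigger under "When this happens..." (hypothetical model)."""
    event: str                        # "created", "file_added" or "state_changed"
    from_state: Optional[str] = None  # None stands for "any state"
    to_state: Optional[str] = None    # None stands for "any state"


@dataclass(frozen=True)
class Event:
    """Something that happened to a document (hypothetical model)."""
    kind: str
    from_state: Optional[str] = None
    to_state: Optional[str] = None


def trigger_matches(trigger: Trigger, event: Event) -> bool:
    """Check one trigger against one event."""
    if trigger.event != event.kind:
        return False
    if trigger.event != "state_changed":
        return True
    from_ok = trigger.from_state in (None, event.from_state)
    to_ok = trigger.to_state in (None, event.to_state)
    return from_ok and to_ok


def should_process(triggers, filters, event, properties) -> bool:
    """Assumption: any trigger may fire, but every filter must match."""
    if not any(trigger_matches(t, event) for t in triggers):
        return False
    return all(properties.get(name) == value for name, value in filters.items())


# The SOP example from this article, expressed with the model above.
TRIGGERS = [
    Trigger("created"),
    Trigger("state_changed", from_state="Pending approval", to_state="Approved"),
    Trigger("state_changed", from_state="Draft"),  # "Draft" is an assumed state name
]
FILTERS = {"Site": "Riverside", "Document type": "Maintenance procedure"}

doc = {"Site": "Riverside", "Document type": "Maintenance procedure"}
print(should_process(TRIGGERS, FILTERS, Event("created"), doc))
print(should_process(TRIGGERS, FILTERS, Event("state_changed", "Approved", "Archived"), doc))
```

In this model, a document that matches a trigger but has a different "Site" value is skipped, which mirrors how filters narrow down the documents to process.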
Specify the properties to process
You can add, modify, or delete properties at any time.
1. Click + Select properties to process to open a side panel where you can select which properties AI processing will extract from your documents.
2. Click + Add property.
In our example, we selected several properties to process.
3. Select the properties you want to process.
4. Click Update to confirm your choices.
5. For each selected property, you can improve extraction accuracy. You can:
- Create an alias to provide an alternative label that will be used by the LLM instead of the property name – this is useful when the property name in your document class is user-friendly but not precise enough for AI to understand.
- Add an instruction to improve extraction accuracy – you can:
- add additional context about the property
- specify the usual position of the data within the document
- define what the property is NOT, to avoid confusion
In our example, we added the alias and an instruction for the "Document type" property.
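To picture what an alias and an instruction add to a property, you can think of each property as a small record. The sketch below is purely illustrative: the `PropertyHint` class, its field names, and the sample alias and instruction text are hypothetical, and AODocs does not document how these values are actually passed to the LLM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyHint:
    """Hypothetical record of the hints attached to one property."""
    property_name: str
    alias: Optional[str] = None
    instruction: Optional[str] = None

    def llm_label(self) -> str:
        # The alias, when present, is used instead of the property name.
        return self.alias or self.property_name

    def describe(self) -> str:
        # Combine the label and the instruction into one line of guidance.
        if self.instruction:
            return f"{self.llm_label()}: {self.instruction}"
        return self.llm_label()


# Sample values invented for illustration only.
document_type = PropertyHint(
    property_name="Document type",
    alias="Type of procedure",
    instruction="Usually stated in the page header. This is NOT the file format.",
)
site = PropertyHint(property_name="Site")

print(document_type.describe())
print(site.describe())
```

A property without an alias keeps its own name as the label, so you only need aliases where the property name is ambiguous.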
You can now do one of the following:
- set up one or more reference document classes
- set up optional advanced AI processing features
- finalize and save your AI processing configuration
Reference document classes: enhance AI accuracy with ground truth data
The Reference Document Class feature acts as a "source of truth" for your organization’s metadata. By connecting your AI extraction process to a pre-validated database of information, you can make sure that the data entering your system is consistent, verified, and high-quality.
How it works
When AI extracts metadata from a document (such as a company name or a VAT number), it doesn't rely only on what it "sees" on the page. It also attempts to match that extraction against your reference document class.
Smart matching: if a match is found (even if not 100% identical), the system pulls additional, validated data from the reference class and automatically populates your document properties.
Data validation: this process eliminates typos or "hallucinations" by forcing the document to adopt the pre-approved values from your database.
Confidence guardrails: if no match is found in your reference data, AI marks the metadata confidence score as "low", alerting users that the information requires manual verification.
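The idea behind smart matching and confidence guardrails can be sketched as follows. This is a minimal illustration that assumes a simple string-similarity comparison from the Python standard library. The actual matching method used by AI for AODocs is not described in this article, and the `validate` function, the 0.8 threshold, and the company names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def validate(extracted: str, reference_values: list, threshold: float = 0.8) -> dict:
    """Replace the extracted value with the closest reference value, if close enough."""
    best = max(reference_values, key=lambda ref: similarity(extracted, ref), default=None)
    if best is not None and similarity(extracted, best) >= threshold:
        # Match found: adopt the pre-approved value from the reference data.
        return {"value": best, "low_confidence": False}
    # No match: keep the extracted value and flag it for manual verification.
    return {"value": extracted, "low_confidence": True}


reference = ["Acme Corporation", "Globex Inc", "Initech LLC"]
print(validate("Acme Corporaton", reference))  # typo in the extracted name
print(validate("Umbrella Ltd", reference))     # not in the reference data
```

In this sketch, the misspelled name is replaced by the reference value, while the unknown company keeps its extracted value and is flagged for review.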
Follow these steps to link your document class to a reference source and automate your metadata validation.
Create a mapping to your reference class
1. In your AI processing configuration, click Add referential based on other document classes.
2. Select the document class you want to pull data from.
Tip: Select the class that contains your "ground truth" data. For example, a client database or a vendor list.
In our example, we want to extract data from the "Contractors" document class.
3. If required, change the configuration name.
Provide a unique name for this specific mapping to keep your workspace organized.
Map your key properties
The key properties are the specific pieces of information AI uses to look up a record in the reference class.
1. Click Add key mapping.
2. Pair your properties: select one property from your current class (the one being filled by AI) and the corresponding property in your reference class.
In our example, we mapped the AI-extracted "Company Name" to the "Central company list" property.
When the values of these properties are identical, other properties in the current document class are filled in by pulling values from corresponding properties in the reference document class.
Define additional data to pull
Once a match is found via the key property, you can automate the filling in of other related fields.
1. Activate the switch Additional properties to extract automatically from reference class.
2. Click Add additional mapping to select a property that will be filled in automatically from the reference class. Do this as many times as required.
3. Select the destination property in your current class and the source property from the reference class.
Repeat this for all fields you want to automate. For example, "Address", "Phone Number", "VAT".
4. Click Update to confirm your choices.
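The way a key mapping and additional mappings work together can be sketched as a lookup followed by a copy. This is an illustrative model only: the `enrich_from_reference` function and the sample records are hypothetical, and the sketch assumes an exact comparison of the key values.

```python
def enrich_from_reference(document, reference_records, key_mapping, additional_mappings):
    """Return a copy of `document` completed with values from the matching record.

    key_mapping: (property in current class, property in reference class)
    additional_mappings: {destination property: source property in reference class}
    """
    current_key, reference_key = key_mapping
    key_value = document.get(current_key)
    enriched = dict(document)
    if key_value is None:
        return enriched  # nothing to look up
    for record in reference_records:
        if record.get(reference_key) == key_value:
            # Match found: copy each mapped property from the reference record.
            for destination, source in additional_mappings.items():
                if source in record:
                    enriched[destination] = record[source]
            break
    return enriched


# Hypothetical records in the "Contractors" reference class.
contractors = [
    {"Central company list": "Acme Corporation", "Address": "1 Main Street",
     "Phone Number": "555-0100", "VAT": "XX123456789"},
]
extracted = {"Company Name": "Acme Corporation"}

result = enrich_from_reference(
    extracted,
    contractors,
    key_mapping=("Company Name", "Central company list"),
    additional_mappings={"Address": "Address", "Phone Number": "Phone Number", "VAT": "VAT"},
)
print(result)
```

If no record matches the key value, the document is returned unchanged, so only the properties extracted by AI are filled in.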
Set up optional advanced AI processing features
These settings are optional but can help optimize your AI processing for specific use cases.
1. Activate the switch Attachments contain hand-written text if your documents contain handwritten text. This leverages the vision capabilities of your LLM instead of standard OCR (Optical Character Recognition).
Note: Activating this option increases the usage cost of the LLM.
2. Under If the Property Has Values select how AI processing should handle properties that already contain values:
- Do not overwrite existing values: AI processing fills in empty properties only. Use this when some properties are pre-filled by users and you don't want the AI to override their values.
- Overwrite existing values: use this when you want AI processing to replace any existing property values.
3. Under Extract information from you can restrict which file types to process.
By default, all attached files are processed.
You can click Only selected file types and select the file types you want to process from the drop-down list. Only attached files of the selected types will be processed.
The following file types are available:
- PDF Document
- Google Docs Document
- Google Sheets Spreadsheet
- Google Slides Presentation
- Google Drawing
- Microsoft Word Document
- Microsoft Word Document (OpenXML)
- Microsoft Excel Spreadsheet
- Microsoft Excel Spreadsheet (OpenXML)
- Microsoft PowerPoint Presentation
- Microsoft PowerPoint Presentation (OpenXML)
The following image formats are available:
- JPEG Image
- PNG Image
- TIFF Image
- SVG Image
- WebP Image
4. To add comprehensive instructions for more complex processing scenarios, activate the switch Add instructions to make AI processing more accurate and relevant. Unlike the property-level instruction field, this area allows for longer, more detailed instructions.
You can:
- run complex queries
- evaluate the processed document against a set of criteria
- provide overarching guidelines for the entire extraction process
5. Activate the switch LLM Provider to change your default domain LLM provider. Select either:
- OpenAI
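Two of the settings above, If the Property Has Values and Extract information from, can be pictured with the short sketch below. It is an illustration under stated assumptions rather than the AODocs implementation: the function names and sample data are hypothetical, and the sketch treats a missing value or an empty string as an "empty" property.

```python
def apply_extraction(existing, extracted, overwrite):
    """Merge extracted values into existing properties according to the overwrite setting."""
    result = dict(existing)
    for name, value in extracted.items():
        is_empty = result.get(name) in (None, "")  # assumed definition of "empty"
        if overwrite or is_empty:
            result[name] = value
    return result


def files_to_process(attachments, selected_types=None):
    """Keep all attached files by default, or only those of the selected file types."""
    if selected_types is None:
        return list(attachments)
    return [a for a in attachments if a["type"] in selected_types]


existing = {"Site": "Riverside", "Document type": ""}
extracted = {"Site": "Lakeside", "Document type": "Maintenance procedure"}

print(apply_extraction(existing, extracted, overwrite=False))
print(apply_extraction(existing, extracted, overwrite=True))

attachments = [
    {"name": "procedure.pdf", "type": "PDF Document"},
    {"name": "photo.png", "type": "PNG Image"},
]
print(files_to_process(attachments, selected_types={"PDF Document"}))
```

With Do not overwrite existing values, the pre-filled "Site" value is kept and only the empty "Document type" property is filled in.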
Finalize and save your AI processing configuration
1. Review your mappings to make sure the logic flows correctly from the reference source to your new document.
2. Click Save to activate the configuration.
Reminder: AI processing configurations don't auto-save.
Test your configuration
You can test your AI processing configuration on an existing document at any time:
1. Click the Preview AI processing button.
2. Select a document from your document class.
3. Click Run test.
The document is processed (but not updated), and the extracted values are displayed alongside confidence scores.
Low confidence scores don't necessarily mean the values are incorrect. They simply indicate that the LLM is less certain about the accuracy of the extraction. We recommend reviewing low-confidence extractions to check their accuracy.
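If you keep a record of test results, listing the properties to review is straightforward. The sketch below assumes that each extracted value comes with a confidence label such as "low" or "high". The structure of `preview`, the label "high", and the sample values are hypothetical and do not reflect an AODocs export format.

```python
def low_confidence_properties(preview):
    """Return the names of the properties whose confidence label is "low"."""
    return sorted(
        name for name, result in preview.items() if result["confidence"] == "low"
    )


# Hypothetical test results for one document.
preview = {
    "Company Name": {"value": "Acme Corporation", "confidence": "high"},
    "VAT": {"value": "XX123456789", "confidence": "low"},
    "Site": {"value": "Riverside", "confidence": "low"},
}

for name in low_confidence_properties(preview):
    print(f"Review {name}: {preview[name]['value']}")
```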
Next steps
After creating an AI processing configuration:
1. Test the configuration with various documents from your document class.
2. Monitor extraction accuracy and adjust instructions as needed. See Set up optional advanced AI processing features.
3. Create additional configurations for different document types or scenarios.
Troubleshooting
AI processing isn't starting
- Check that your trigger conditions are correctly configured.
- Check that filters (if used) match your document properties.
- Make sure AI for AODocs is activated in your library.
Extraction accuracy is low
- Add or refine property-specific instructions.
- Consider using aliases for properties with unclear names.
- Use the advanced instruction field for additional context.
- Test with different sample documents.