An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.
Hello @IT Cognity
Thank you for bringing this excellent architectural question to our attention. We understand how critical it is to build a resilient, low-maintenance document extraction architecture for your business.
For documents subject to frequent layout, branding, and template variations—such as utility bills—Microsoft highly recommends transitioning from multiple Custom Template models to a single Custom Neural (or Custom Generative) extraction model.
Recommended Architectural Shift
Instead of using a Custom Classifier to route to supplier-specific Custom Template models, we advise consolidating this workflow.
- Custom Template models rely heavily on static visual structures and positions, making them brittle to spacing or banner changes.
- Custom Neural models utilize deep learning to recognize document semantics and structure, allowing them to reliably extract key-value pairs from semi-structured and unstructured documents entirely independent of strict layout adherence.
Design and Training Best Practices
To effectively design your models so they are resilient to supplier-specific layouts:
- Consolidate to a Single Model: Unify your extraction requirements into a single Custom Neural extraction model. You no longer need to strictly classify and route by supplier if the fields to extract remain the same.
- Diversify Training Data: Curate a highly diverse training dataset that represents the full spectrum of your production variants. Include at least 5-10 samples from each major supplier, intentionally capturing variations in banners, spacing, and styling. The neural model requires this diversity to learn the context of a field (e.g., recognizing an "Amount Due" label) rather than its absolute visual position.
- Evaluate Generative Capabilities: If your utility bills vary wildly, we also recommend evaluating the Custom Generative extraction capabilities, which excel at handling entirely unstructured and dynamic templates with minimal training data.
By adopting a Custom Neural approach trained on a diverse representative dataset, you will maximize extraction accuracy while significantly reducing the operational overhead of constant model retraining.
If you found this to be helpful, please Upvote and mark it as an Accepted answer. This helps others find relevant help to similar issues on the platform.
Best regards,
Andrew S Taylor