Expected behavior of Document Intelligence classification with dynamic banner content

IT Cognity 0 Reputation points
2026-05-27T12:43:45.7533333+00:00

Hello,

We are using Azure AI Document Intelligence for document classification/data extraction in a Salesforce-based digital onboarding flow.

We are attaching two sample utility bill documents/images that should belong to the same functional document category. However, classification is not happening correctly.

In these samples, there are visible differences between the documents. One difference is the marketing/information banner area. The banner is placed in the same predefined section of the document and has approximately the same layout/position, but the actual banner content/image can change dynamically. This content is managed directly by Marketing/COPS teams without technical involvement.

At the same time, there are also other visual/layout differences between the sample documents, such as branding, colors, section styling, spacing, field positioning, and general template presentation.

We would like Microsoft’s official guidance for this specific scenario and for the general expected behavior of Document Intelligence classification.

Specifically

  1. Based on the attached sample documents, is the classification issue more likely caused by the dynamic banner content itself, or by the broader layout/template differences between the documents?

  2. For documents where the banner is always placed in the same section, with the same approximate position and dimensions, but only the banner image/content changes dynamically, is Document Intelligence expected to classify the document correctly without retraining?

  3. More generally, if dynamic banner content in the same layout can potentially affect classification, what is Microsoft’s recommended approach?

  - Should the banner area be masked/excluded before analysis?

  - Should the model be trained with multiple representative banner variations?

Our goal is to confirm whether the current behavior is expected product behavior, whether the banner alone could explain the issue, or whether the misclassification is more likely caused by the combination of banner changes and other layout/template differences.

Azure Document Intelligence in Foundry Tools

1 answer

  1. Andrew Taylor - COREZENN 980 Reputation points Volunteer Moderator
    2026-06-02T19:56:23.68+00:00

    Hello @IT Cognity

    Thank you for bringing this excellent architectural question to our attention. We understand how critical it is to build a resilient, low-maintenance document extraction architecture for your business.


    For documents subject to frequent layout, branding, and template variations, such as utility bills, Microsoft recommends moving from multiple Custom Template models to a single Custom Neural (or Custom Generative) extraction model.

    Instead of using a Custom Classifier to route to supplier-specific Custom Template models, we advise consolidating this workflow.

    • Custom Template models rely heavily on static visual structures and positions, making them brittle to spacing or banner changes.
    • Custom Neural models use deep learning to recognize document semantics and structure, so they can extract key-value pairs from semi-structured and unstructured documents without depending on a fixed layout.
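
To make the contrast in the two bullets above concrete, here is a toy sketch in plain Python. It is only an illustration of position-based versus label-anchored lookup and does not reflect how Document Intelligence is implemented internally. The `Word` type, both functions, and the coordinates are hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A piece of recognized text with a hypothetical page position."""
    text: str
    x: float
    y: float


def extract_by_position(words, x, y, tol=5.0):
    """Template-style lookup: return the text found near a fixed coordinate."""
    for w in words:
        if abs(w.x - x) <= tol and abs(w.y - y) <= tol:
            return w.text
    return None


def extract_by_label(words, label, line_tol=2.0):
    """Label-anchored lookup: return the nearest text to the right of the label."""
    anchors = [w for w in words if w.text == label]
    if not anchors:
        return None
    anchor = anchors[0]
    same_line = [
        w for w in words
        if abs(w.y - anchor.y) <= line_tol and w.x > anchor.x
    ]
    return min(same_line, key=lambda w: w.x).text if same_line else None


# Same bill, but in the second layout a taller banner pushes the content down.
original = [Word("Amount Due", 50, 300), Word("42.10", 200, 300)]
shifted = [Word("Amount Due", 50, 340), Word("42.10", 200, 340)]

fixed_original = extract_by_position(original, 200, 300)
fixed_shifted = extract_by_position(shifted, 200, 300)
label_shifted = extract_by_label(shifted, "Amount Due")
```

In this toy setup the fixed-coordinate lookup finds the amount in the original layout but misses it once the banner shifts the content, while the label-anchored lookup still finds it. A neural model learns far richer cues than this, but the failure mode of the fixed-coordinate approach is the same in spirit.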

    Design and Training Best Practices

    To effectively design your models so they are resilient to supplier-specific layouts:

    1. Consolidate to a Single Model: Unify your extraction requirements into a single Custom Neural extraction model. You no longer need to strictly classify and route by supplier if the fields to extract remain the same.
    2. Diversify Training Data: Curate a highly diverse training dataset that represents the full spectrum of your production variants. Include at least 5-10 samples from each major supplier, intentionally capturing variations in banners, spacing, and styling. The neural model requires this diversity to learn the context of a field (e.g., recognizing an "Amount Due" label) rather than its absolute visual position.
    3. Evaluate Generative Capabilities: If your utility bills vary wildly, we also recommend evaluating the Custom Generative extraction capabilities, which excel at handling entirely unstructured and dynamic templates with minimal training data.
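
One way to act on the "Diversify Training Data" step before training is to audit the labeled set for thin coverage. The following is a minimal sketch, assuming each training sample has been tagged by hand with a `supplier` and a `banner_variant`. Those field names, the `coverage_gaps` helper, and the threshold of two banner variants are assumptions made for this example and are not part of any Document Intelligence API. The minimum of 5 samples is the lower bound of the guidance above.

```python
from collections import Counter

# Lower bound of the "5-10 samples per major supplier" guidance above.
MIN_SAMPLES_PER_SUPPLIER = 5


def coverage_gaps(samples, min_per_supplier=MIN_SAMPLES_PER_SUPPLIER,
                  min_banner_variants=2):
    """Return {supplier: [reasons]} for suppliers with thin training coverage.

    `samples` is a list of dicts with hypothetical keys
    "supplier" and "banner_variant".
    """
    per_supplier = Counter(s["supplier"] for s in samples)

    # Collect the distinct banner variants seen for each supplier.
    banners = {}
    for s in samples:
        banners.setdefault(s["supplier"], set()).add(s["banner_variant"])

    gaps = {}
    for supplier, count in per_supplier.items():
        reasons = []
        if count < min_per_supplier:
            reasons.append(f"only {count} samples (want >= {min_per_supplier})")
        if len(banners[supplier]) < min_banner_variants:
            reasons.append(f"only {len(banners[supplier])} banner variant(s)")
        if reasons:
            gaps[supplier] = reasons
    return gaps


# Example: supplier A is well covered, supplier B is not.
samples = (
    [{"supplier": "A", "banner_variant": f"b{i % 3}"} for i in range(6)]
    + [{"supplier": "B", "banner_variant": "b0"} for _ in range(2)]
)
gaps = coverage_gaps(samples)
```

Suppliers that show up in `gaps` are candidates for collecting more samples, ideally with different banners, before the model is trained.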

    By adopting a Custom Neural approach trained on a diverse representative dataset, you will maximize extraction accuracy while significantly reducing the operational overhead of constant model retraining.


    If you found this helpful, please Upvote and mark it as an Accepted answer. This helps others on the platform find relevant help for similar issues.

    Best regards,

    Andrew S Taylor
