Print media pages are recognized best when they contain photos, graphics, and large titles that make them visually distinctive. Many small, dense paragraphs of text, however, can degrade recognition. It is therefore often better to use only the visually rich part of a page as the reference image rather than the entire page.
The following figure shows examples that work well and examples that do not; each is explained below the figure.
Preparing the best reference image for recognizing an article page from a newspaper:
- example query image that should be recognized correctly;
- ideal reference image for this query: it shows a large photograph and a title, and the dense text paragraph on the left-hand side is small;
- sub-optimal reference image with a long text paragraph;
- poor reference image showing only dense text.
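
As a minimal sketch of the cropping advice above, assuming you already know the pixel bounding boxes of the visually rich elements (for example from manual annotation), the snippet below computes a crop rectangle that covers them with a small margin. The function name `crop_box_for_visual_regions`, the page size, and all coordinates are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple

# (left, top, right, bottom) in pixels, origin at the top-left corner
Box = Tuple[int, int, int, int]


def crop_box_for_visual_regions(
    page_size: Tuple[int, int],
    regions: Iterable[Box],
    margin: int = 20,
) -> Box:
    """Return the smallest box that covers all regions, padded by
    `margin` pixels and clamped to the page boundaries."""
    regions = list(regions)
    if not regions:
        raise ValueError("at least one visual region is required")
    width, height = page_size
    left = max(0, min(r[0] for r in regions) - margin)
    top = max(0, min(r[1] for r in regions) - margin)
    right = min(width, max(r[2] for r in regions) + margin)
    bottom = min(height, max(r[3] for r in regions) + margin)
    return (left, top, right, bottom)


# Hypothetical layout of a 2000 x 3000 px page scan: the dense text
# column on the left is left out, the title and photo are kept.
page_size = (2000, 3000)
title = (600, 100, 1900, 400)
photo = (600, 450, 1900, 1500)

box = crop_box_for_visual_regions(page_size, [title, photo])
print(box)  # (580, 80, 1920, 1520)
```

The resulting `(left, top, right, bottom)` tuple follows the box convention used by Pillow's `Image.crop`, so it could be passed to that method to produce the reference image. This is one possible approach, not a requirement of any particular recognition service.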