Newspaper Page Decomposition Using a Split and Merge Approach

2001

https://doi.org/10.1109/ICDAR.2001.953972

Abstract

Indexing large newspaper archives requires automatic page decomposition algorithms with high accuracy. In this paper, we present our approach to an automatic page decomposition algorithm developed for the First International Newspaper Segmentation Contest. Our approach decomposes the newspaper image into image regions, horizontal and vertical lines, text regions and title areas. Experimental results are obtained from the data set of the contest.

FAQs


What novel approach does the paper suggest for newspaper page decomposition?

The paper presents a split and merge technique that segments newspaper pages using extracted vertical and horizontal lines, departing from traditional top-down methods like Nagy's X-Y cut algorithm.
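To make the idea of "extracted vertical and horizontal lines" concrete, the following is a minimal sketch of one way separator lines could be located in a binarized page: scan each row (or column) for long unbroken runs of foreground pixels. This is an illustration only, not the paper's implementation. The function names, the list-of-lists image representation (1 = ink, 0 = background), and the `min_length` threshold are all assumptions introduced here.

```python
def find_horizontal_runs(image, min_length):
    """Return (row, first_col, last_col) for each foreground run of at least min_length pixels.

    `image` is a hypothetical binary page: a list of rows, 1 = ink, 0 = background.
    """
    runs = []
    for y, row in enumerate(image):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right edge is closed.
        for x, pixel in enumerate(list(row) + [0]):
            if pixel == 1 and start is None:
                start = x
            elif pixel != 1 and start is not None:
                if x - start >= min_length:
                    runs.append((y, start, x - 1))
                start = None
    return runs


def find_vertical_runs(image, min_length):
    """Return (col, first_row, last_row) for each long vertical run, by transposing the image."""
    transposed = [list(column) for column in zip(*image)]
    return find_horizontal_runs(transposed, min_length)


page = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
]
horizontal = find_horizontal_runs(page, min_length=4)
vertical = find_vertical_runs(page, min_length=4)
```

On this toy page, the top row yields one horizontal run and the middle column yields one vertical run. A real scan would need tolerance for gaps, skew, and line thickness, none of which this sketch handles.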

How effective is the proposed method on historical newspaper pages?

Tested on 20 scanned front pages from the First International Newspaper Segmentation Contest, the method shows promising results despite limitations in handling complex layouts and image noise.

What preprocessing methods are utilized to enhance newspaper image segmentation?

Only basic preprocessing is applied, primarily filtering out small connected components; the algorithm does not use more advanced image enhancement techniques.
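Filtering small connected components can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: the image is a hypothetical list-of-lists binary bitmap (1 = ink), components are taken to be 8-connected, and `min_size` is an illustrative threshold that the source does not specify.

```python
from collections import deque


def remove_small_components(image, min_size):
    """Return a copy of a binary image with foreground components smaller than min_size cleared."""
    height = len(image)
    width = len(image[0]) if height else 0
    result = [list(row) for row in image]
    visited = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or visited[y][x]:
                continue
            # Breadth-first search collects every pixel of this component.
            component = []
            queue = deque([(y, x)])
            visited[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                # Visit all 8 neighbours (assumed connectivity).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not visited[ny][nx]):
                            visited[ny][nx] = True
                            queue.append((ny, nx))
            # Components below the threshold are treated as noise and erased.
            if len(component) < min_size:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result


noisy = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
cleaned = remove_small_components(noisy, min_size=2)
```

Here the 2x2 block survives while the two isolated pixels are removed. The input image is left unmodified, so the threshold can be tuned by re-running on the same data.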

What limitations hinder the accuracy of the segmentation process?

Key limitations include linear decision-making in the algorithms and inadequate consideration of background lines, leading to potential misclassification of components.

Which future enhancements are recommended for improving the system's performance?

Future improvements could focus on refining the merge operation, enhancing title detection within text blocks, and integrating font recognition to optimize results further.

References (9)

  1. F. Bapst, R. Brugger, and R. Ingold. Towards an Interactive Document Recognition System. Internal working paper 95-09, IIUF-Université de Fribourg, March 1995.
  2. B. Gatos and A. Antonacopoulos. First International Newspaper Segmentation Contest. http://www.lpa.gr/contest/, 2001.
  3. B. Gatos, S. L. Mantzaris, K. V. Chandrios, A. Tsigris, and S. J. Perantonis. Integrated algorithms for newspaper page decomposition and article tracking. In ICDAR'99: Fifth International Conference on Document Analysis and Recognition, pages 559-562, Bangalore, India, Sept. 1999.
  4. V. Govindaraju, S. W. Lam, D. Niyogi, D. B. Sher, R. Srihari, S. N. Srihari, and D. Wang. Newspaper image understanding. In S. Ramani, R. Chandrasekar, and K. S. R. Anjaneyulu, editors, Knowledge Based Computer Systems, pages 375-84. Narosa Publishing House, New Delhi, India, 1990.
  5. O. Hitz, L. Robadey, and R. Ingold. Analysis of Synthetic Document Images. In ICDAR'99: Fifth International Conference on Document Analysis and Recognition, pages 555-558, Bangalore, India, Sept. 1999.
  6. G. Nagy, S. Seth, and M. Viswanathan. A prototype document image analysis system for technical journals. Computer, 25(7):10-22, July 1992.
  7. L. Robadey, O. Hitz, and R. Ingold. Segmentation de documents ideaux structure complexe. In CIFED'2000: Colloque International Francophone sur l'Ecrit et le Document, pages 383-392, Lyon, France, July 2000.
  8. D. Wang and S. N. Srihari. Classification of newspaper image blocks using texture analysis. Computer Vision, Graphics, and Image Processing, 47(3):327-352, Sept. 1989.
  9. A. Zramdini. Study of Optical Font Recognition Based on Global Typographical Features. PhD thesis, University of Fribourg, 1995.