A Document Recognition System for Early Modern Latin

2006, Chicago Colloquium on Digital Humanities and …

Abstract

High-accuracy optical character recognition (OCR) engines facilitate the large-scale digitization of manuscripts. Our work focuses on using these tools to digitize Latin texts. Many Latin texts, especially early modern ones, make heavy use of special characters ...
