The rich manuscript heritage of Italy, preserved in archives and libraries, is becoming increasingly accessible to a wider audience through dedicated digitization initiatives. However, the interpretation of these manuscripts often proves... more
The shipping and logistics industry is undergoing a transformative shift driven by rapid advancements in technology. Innovations such as artificial intelligence (AI), the Internet of Things block chain, automation, and big data analytics... more
The rapid growth of digital data has led to the widespread creation and storage of digital images containing text. The extraction and use of textual information might be advantageous for various kinds of domains. Text detection in natural... more
Optical character recognition (OCR) deals with the process of identification of alphabets and various scripts. Optical character recognition (OCR) is one of the trending topics of all time. Optical character recognition (OCR) is used for... more
Old Uyghur manuscripts remain largely inaccessible due to the absence of optical character recognition (OCR) systems aligned with modern scholarly practices. To address this, we present a work‑in‑progress OCR approach using fine-tuning of... more
Old Uyghur manuscripts remain largely inaccessible due to the absence of optical character recognition (OCR) systems aligned with modern scholarly practices. To address this, we present a work‑in‑progress OCR approach using fine-tuning of... more
Here, we introduce the Historical Kannada Handwritten Palm Leaf (HKHPL) Dataset, the pioneering collection of historical palm leaf manuscripts written in Kannada. It includes original ground-truth and binarised images. This dataset is... more
Extracting texts from images with complex backgrounds is a major challenge today. Many existing Optical Character Recognition (OCR) systems could not handle this problem. As reported in the literature, some existing methods that can... more
The use of advanced computational methods for the analysis of digitised texts is becoming increasingly popular in humanities and social science research. One such technology is Handwritten Text Recognition (HTR), which generates... more
Perkembangan teknologi informasi dan komunikasi telah membawa perubahan signifikan dalam berbagai aspek kehidupan, termasuk dalam pengelolaan administrasi di institusi pendidikan. Pengarsipan dokumen, khususnya surat menyurat, merupakan... more
In order to secure the usability of records over the retention period, the archives conduct digital preservation actions which may entail conversion, migration and other procedures. When using digitization for creating copies in the form... more
The cultural and regional diversity across the world and specifically in India has given birth to a large number of writing systems and scripts having a variety of character sets. For scripts having a larger character set, just a simple... more
The National Library of Finland Digitisation Programme 2025–2028 reflects the goals of sustainable development. With its four-year digitisation programmes, the National Library strives to ensure the systematic selection of the material to... more
The detection and extraction of scene text from document images is one of the challenging research areas. Many researchers have detected and extracted the text from plain text background. But the multi-oriented scene text detection is one... more
The code import the YoutubeTranscriptionApi from youtube_transcription_api libray, The YouTube video ID is defined. The transcription data for the given video ID is fetched using get_transcription method. 6) This methodology outlines the... more
Optical Character Recognition (OCR) has been a transformative technology for decades, enabling the conversion of printed or handwritten text into machine-readable formats. Traditional OCR applications, built on rule-based algorithms and... more
This article presents the findings and discussions of the “Digital Transcription of Ottoman Texts” workshop held at Marmara University in December 2023, focusing on current efforts, challenges, and innovations in the automatic... more
Diagnostic procedure, one of the most essential steps in healthcare, determines the protocol for the treatment of a patient. Doctors can provide a more accurate diagnosis and better treatment by comparing patients' recent and older test... more
Tariff exemptions are fundamental to attracting Foreign Direct Investment (FDI) into the manufacturing sector, though the associated administrative processes present areas for optimization for both investing entities and the national tax... more
Session Title: FAIR Periodicals: What Does It Really Mean to 'Digitize' a periodical? Panel titles: 1. FAIR Design: Metadata Strategies for Image-Driven Periodicals. 2. From OCR to AI: New Challenges and Tools in Text-Image Based... more
This article examines the pedagogical adaptation of Textual Communities, a digital tool originally developed for collaborative research in textual scholarship, to teach paleography at the undergraduate level within a liberal arts context.... more
Research Article Evaluating open-ended exams presents significant challenges in terms of time management and consistency in educational processes. This study aims to develop an iOS-based mobile application, "Exam Reader" to streamline the... more
There are two most popular writing styles of Urdu i.e. Naskh and Nastaliq. Considering Arabic OCR research, ample amount of work has been done on Naskh writing style; focusing on Urdu, which uses Arabic character set commonly used... more
Over 8,024 wildfire incidents have been documented in 2024 alone, affecting thousands of fatalities and significant damage to infrastructure and ecosystems. Wildfires in the United States have inflicted devastating losses. Wildfires are... more
There's a good deal of change in the world of insurance. The growth of the Digital Era has created open doors for technology-driven insurance claims management mechanisms. At the same time, mobile technology and AI have been used to be... more
This paper explores the methodological and technological challenges involved in reconstructing the history of amateur filmmaking in postwar Italy through the analysis of specialized periodicals. Originating in a PhD project on Ferrania, a... more
Optical character recognition for the English text may be considered one of the most important research topics, whether, printed or handwritten. Although excellent results have been reached in the English text, there is a lack of this... more
The paper is a comprehensive review of the current research trends in the area of Arabic language especially state-of-the-art approaches to highlight the current status of diverse research aspects of that area to facilitate the adaption... more
Abstract (ITA): Per un’istituzione culturale far riferimento all’ecosistema IIIF non significa certo uniformarsi a una moda tecnologica del momento, ma molto più opportunamente rappresenta a tutt’oggi la via migliore per assicurare ai... more
Dieser Beitrag stellt eine neuartige Methode zur optischen Zeichenerkennung ( Optical Character Recognition, OCR) speziell für Textvorlagen des 17. Jahrhunderts vor. Anstatt ein neues OCR-Verfahren zu entwickeln, werden zwei etablierte... more
The handwriting recognition is the scheme of converting text symbolized in the spatial form of graphical symbols into its figurative depiction. Handwritten characters have been the most accredited technique of collecting, storing and... more
Between the 16th and 18th centuries, Spanish and Indigenous people produced millions of colonial documents in diverse calligraphic styles in Latin American territories. Accessing their rich information requires specialised palaeographic... more
Contemporary healthcare techniques involve maintaining a physical record of doctor visits, prescriptions, medical findings, tests, etc. Receiving our medical reports on WhatsApp, calling our doctor, or searching for medicines online is... more
Cuneiform language is an old language that was invented by the people of Sumerian nation. It is an essential language for many archeologists. Especially who are interested in studying and investigating the old nations of Iraq. Dealing... more
The paper demonstrates that malware can be detected through supervised machine learning. Using machine learning for malware detection has technical and commercial benefits. This is because malware detection in the current context is about... more
The form processing systems commercially available include a verification step during which a human operator verifies the output provided by the system to ensure 100% accuracy. In order to reduce the time and the cost of such a stage, the... more
Presentación de los proyectos digitales sobre textos mortuorios de los reinos Antiguo y Medio egipcios desarrollados en la UAH.
The admission documentation process for Indian students is often time-consuming and labourintensive. This research explores leveraging Optical Character Recognition (OCR), generative Artificial Intelligence (AI), and automation to... more
Handwriting character recognition is an active and challenging area of research in the pattern-recognition field. Nowadays, many algorithms are introduced to achieve the highest accuracy. An improved recognition result could be... more
In this presentation, we briefly describe five projects of analysis and interpretation of the mortuary texts written in Earlier Egyptian, that is, ancient and middle Egyptian (between 2500 and 1500 BC). Most of the texts under study are... more
https://www.transkribus.org/model/agapet-13th-century As part of the Agapet project, which aims to train with Transkribus HTR models to digitize Christian Arabic literature from the 9th to the 19th century, we present a new model,... more
Persons of visual impairment make up a growing segment of modern society. To cater to the special needs of these individuals, society ought to consider the design of special constructs to enable them to fulfill their daily necessities.... more
Newspapers are among the most comprehensive textual records from late Ottoman Palestine that were produced by the inhabitants of the region themselves. When subjected to ‘close reading,’ they can provide additional information about... more
This research focuses on real-time surveillance systems as a means for tackling the issue of non-compliance with helmet regulations, a practice that considerably amplifies the risk for motorcycle drivers or riders. Despite the... more
Cuneiform documents, the earliest known form of writing, are prolific textual sources of the ancient past. Experts publish editions of these texts in transliteration using specialized typesetting, but most remain inaccessible for... more
At the Faculty of Humanities and Social Sciences, University of Zagreb, the Transcribathon Zagreb event took place from November 7 to 11, 2022. This event combined a scientific-professional conference and a student competition in... more
This project focuses on developing a personalized allergen notification system to help individuals identify potential allergens in food products based on their dietary preferences and known allergens. A key feature of this system is... more
utomatic transcription technology, or handwritten text recognition (HTR), unlocks new opportunities for analysing historical texts by transforming document images into machine-readable formats. This paper explores the development and... more
The retail industry, especially grocery and liquor store, faces unprecedented challenges due to a 1.2 million worker shortage in the U.S., leading to deteriorating service quality, inadequate inventory management, and alarming levels of... more