Papers by Georgios Th. Papadopoulos
A group of four organizations from the MESH consortium (www.mesh-ip.eu) participated this year for the first time in the High Level Feature Extraction track in TRECVID. The partners were ). We submitted a total of 6 runs, using different variations and configurations over a common model.
Advances in Multimedia …, 2009
In this paper we propose a methodology for semantic indexing of images, based on techniques of image segmentation, classification and fuzzy reasoning. The proposed knowledge-assisted analysis architecture integrates algorithms applied on three ...
IEEE Transactions on Systems, Man, and Cybernetics, 2011
Computer vision techniques have made considerable progress in recognizing object categories by learning models that normally rely on a set of discriminative features. However, in contrast to human perception that makes extensive use of logic-based rules, these models fail to benefit from knowledge that is explicitly provided. In this paper, we propose a framework that can perform knowledge-assisted analysis
Combining Global and Local Information for Knowledge-Assisted Image Analysis and Classification
Eurasip Journal on Advances in Signal Processing, 2007
A learning approach to knowledge-assisted image analysis and classification is proposed that combines global and local information with explicitly defined knowledge in the form of an ontology. The ontology specifies the domain of interest, its subdomains, the ...
Semantic Image Analysis Using a Learning Approach and Spatial Context
Semantics and Digital Media Technologies, 2006
In this paper, a learning approach coupling Support Vector Machines (SVMs) and a Genetic Algorithm (GA) is presented for knowledge-assisted semantic image analysis in specific domains. Explicitly defined domain knowledge under the proposed approach includes objects of the domain of interest and their spatial relations. SVMs are employed using low-level features to extract implicit information for each object of interest
Ontology-Driven Semantic Video Analysis Using Visual Information Objects
Lecture Notes in Computer Science, 2007
In this paper, an ontology-driven approach for the semantic analysis of video is proposed. This approach builds on an ontology infrastructure and in particular a multimedia ontology that is based on the notions of Visual Information Object (VIO) and Multimedia Information ...
Probabilistic combination of spatial context with visual and co-occurrence information for semantic image analysis
2010 IEEE International Conference on Image Processing, 2010
Georgios Th. Papadopoulos, Vasileios Mezaris, Ioannis Kompatsiaris and Michael G. Strintzis ...
A Unified Framework for Semantic Event Detection
In this poster, we present an approach to contextualized semantic image annotation as an optimization problem. Ontologies are used to capture general and contextual knowledge of the domain considered, and a genetic algorithm is applied to realize the final annotation. Experiments with images from the beach vacation domain demonstrate the performance of the proposed approach and illustrate the added value of utilizing contextual information.

In this chapter, we present our approach to semantic image analysis. Ontologies are used to capture a domain's general, spatial and contextual knowledge, and a genetic algorithm is applied to produce the final annotation. The employed domain knowledge considers high-level information in terms of the concepts of interest of the examined domain, contextual information in the form of fuzzy ontological relations, as well as low-level information in terms of prototypical low-level visual descriptors. To account for the inherent ambiguities in visual information, uncertainty has been introduced and utilized within the definition of the spatial relations. In the proposed process, a hypothesis set of graded annotations is initially produced for each image region, and then context is exploited to appropriately update the estimated degrees of confidence. A genetic algorithm is applied as the last step, in order to select the most plausible annotation by utilizing the visual and spatial concept definitions that are included in the domain ontology. Experiments with a collection of photographs derived from two distinct domains demonstrate the performance of the proposed approach.
Combining Content and Context Information for Semantic Image Analysis and Classification
Content-based video retrieval (CBVR) has gained much attention recently due to its wide applications such as digital library, news broadcasting, and web search. A content-based video indexing and retrieval system involves content analysis and feature extraction, content modeling, ...
Semantic Video Analysis Based on Estimation and Representation of Higher-Order Motion Statistics
In this paper, a generic motion-based approach to semantic video analysis is presented. The examined video is initially segmented into shots and for every resulting shot appropriate motion features are extracted at fixed time intervals. Then, Hidden Markov Models (HMMs) are ...
Towards the automatic classification of pottery sherds: two complementary approaches
This paper presents two complementary approaches to automatically classify pottery sherds: one that focuses on the sherd’s profile and the other that examines visual features of the sherd’s surface. The methods are validated using a set of pottery sherds that were collected during surveys at the ancient site of Koroneia (Greece), which were carried out by the ‘Ancient Cities of Boeotia’ team (under the directorship of Professor J. Bintliff). Both automatic classification techniques produce good results using different sherd classification criteria, such as shape, production technique and chronology.
Local descriptions for human action recognition from 3D reconstruction data

In this paper, a multi-modal context-aware approach to semantic video analysis is presented. Overall, the examined video sequence is initially segmented into shots and for every resulting shot appropriate color, motion and audio features are extracted. Then, Hidden Markov Models (HMMs) are employed for performing an initial association of each shot with the semantic classes that are of interest separately for each modality. Subsequently, a graphical modeling-based approach is proposed for jointly performing modality fusion and temporal context exploitation. Novelties of this work include the combined use of contextual information and multi-modal fusion, and the development of a new representation for providing motion distribution information to HMMs. Specifically, an integrated Bayesian Network is introduced for simultaneously performing information fusion of the individual modality analysis results and exploitation of temporal context, contrary to the usual practice of performing each task separately. Contextual information is in the form of temporal relations among the supported classes. Additionally, a new computationally efficient method for providing motion energy distribution-related information to HMMs, which supports the incorporation of motion characteristics from previous frames to the currently examined one, is presented. The final outcome of this overall video analysis framework is the association of a semantic class with every shot. Experimental results as well as comparative evaluation from the application of the proposed approach to four datasets belonging to the domains of tennis, news and volleyball broadcast video are presented.
A drawback of current computer vision techniques is that, in contrast to human perception that makes use of logic-based rules, they fail to benefit from knowledge that is provided explicitly. In this work we propose a framework that performs knowledge-assisted analysis of visual content using ontologies to model domain knowledge and conditional probabilities to model the application context. A Bayesian network (BN) is used for integrating statistical and explicit knowledge and performing hypothesis testing using evidence-driven probabilistic inference. Our results show significant improvements compared to a baseline approach that does not make any use of context or domain knowledge.
Combining multimodal and temporal contextual information for semantic video analysis
In this paper, a graphical modeling-based approach to semantic video analysis is presented for jointly realizing modality fusion and temporal context exploitation. Overall, the examined video sequence is initially segmented into shots and for every resulting shot appropriate color, motion and audio features are extracted. Then, Hidden Markov Models (HMMs) are employed for performing an initial association of each shot

In this paper, a gaze-based Relevance Feedback (RF) approach to region-based image retrieval is presented. The fundamental idea of the proposed method is the iterative estimation of the real-world objects (or their constituent parts) that are of interest to the user and the subsequent exploitation of this information for refining the image retrieval results. The primary novelties of this work are: a) the introduction of a new set of gaze features for predicting the user's relevance assessment at region level, and b) the design of a time-efficient and effective object-based RF framework for image retrieval. Regarding the interpretation of the gaze signal, a novel set of features is introduced by formalizing the problem from a mathematical perspective, contrary to the exclusive use of explicitly defined features that are in principle derived from the psychology domain. Apart from the temporal attributes, the proposed features also represent the spatial characteristics of the gaze signal, which have not been extensively studied in the literature so far. On the other hand, the developed object-based RF mechanism aims at overcoming the main limitation of region-based RF approaches, i.e. the frequently inaccurate estimation of the regions of interest in the retrieved images. Moreover, the incorporation of a single-camera image processing-based gaze tracker makes the overall system cost-efficient and portable. As shown by the experimental evaluation, the proposed method outperforms representative global- and region-based explicit RF approaches, using a challenging general-purpose image dataset.
Real-time skeleton-tracking-based human action recognition using kinect data