Papers by Lefteris Kozanidis
Ontology-driven personalized query refinement
Journal of Web Engineering, Jun 1, 2009
Google, Inc. (search). ...

Intervention of our research
1.2 Contribution of the dissertation
1.3 Structure of the dissertation
2. Semantic Networks and Disambiguation
2.1 Introduction
2.2 Semantic networks
2.2.1 The WordNet semantic network
2.2.2 EuroWordNet
2.2.3 The BalkaNet semantic network and the Greek WordNet
2.2.4 The GeoWordNet semantic network
2.3 Sense disambiguation
2.3.1 The problem of polysemy
2.3.2 Approaches to word sense disambiguation
2.3.3 Disambiguation techniques based on stored knowledge
2.3.4 Semantic similarity
2.3.5 Metrics for computing semantic similarity
2.3.6 Disambiguation algorithms
2.3.7 Proposed disambiguation technique
2.4 Experimental measurements and results
2.5 Conclusions
3. Web Page Classification
3.1 Introduction
3.2 The classification problem
3.3 Types of web page classification
3.4 Topical classification
3.5 Types of classification features in the web page classification process
3.6 Proposed classification system
3.6.1 Training the classifier
3.6.2 Implementing the classifier
3.6.3 Web page classification algorithm
3.6.4 Experimental evaluation
4. Information Crawlers
4.1 Introduction
4.1 General-purpose crawlers
4.1.1 Infrastructure of a general-purpose crawling system
4.1.2 Proposed architecture of a general-purpose crawling system
4.2 Focused Crawlers
4.2.1 Related research
4.2.2 Architecture of the focused crawler
4.2.3 Evaluation
4.2.4 Remarks
5. Automatic Creation of a Geographically Oriented Index Using Web Data Crawlers
5.1 Introduction
5.2 Related research
5.3 Geographically focused crawling of web pages
5.3.1 Identifying pages with geographic content
5.3.2 Ordering the URLs in the crawler's frontier
5.4 Building a search engine index with geographic references
5.4.1 Submitting geographic queries to the index of geographic references
5.5 Experimental evaluation
5.5.1 Performance of the geographically focused crawler
5.5.2 Retrieval performance with emphasis on geographic data
6. Using Facets for Queries That Include Named Entities
6.1 Introduction
6.2 Related research
6.3 Facet-based representation of queries concerning named entities
6.3.1 Extracting the categories of NE queries
6.3.2 Selecting facet terms to represent the categories of queries that include named entities
6.4 Facet-based search for queries that include named entities
6.5 Experimental evaluation
7. Keyword Identification for the Greek Language in URLs

Advances in Mobile Learning Educational Research, 2022
The current study presents an adaptable light game engine used to produce interactive educational settings focused on cultural heritage. The tool is implemented using inexpensive, open-source technologies. In this paper we first discuss the architecture of the application and then present two games developed with the proposed engine. The games are multi-user and support collaboration and communication among learners and between learners and instructors. Learners earn marks, badges and certificates as they study the material and complete the quizzes. Several evaluation experiments were carried out to assess the suitability of the produced content for educational activities. The evaluation of the authentic educational actions was quite positive, with support from both students and teachers.
Ontology-driven personalized query refinement
Journal of Web Engineering, 2009
The most popular way of finding information on the Web is to go to a search engine, submit a query that describes an information need and receive a list of results that relate to the information soug...
Metadata and Semantics
In this paper, we propose a novel site customization model that relies on a topical ontology to learn user interests as these are exemplified in their site navigations. Based on this knowledge, our mechanism personalizes the site's content and structure to meet particular user needs. Experimental results demonstrate that our model has significant potential for accurately identifying user interests, and show that site customizations that rely on the detected interests help web users experience personalized navigation through the sites' contents.
Arxiv preprint arXiv:0903.2544, Mar 14, 2009
When searching the web, ambiguous queries often return too many results. Text snippets, extracted from the retrieved pages, are an indicator of the pages' usefulness to the query intention and can be used to focus the scope of search results. In this paper, we propose a novel method for automatically extracting web page snippets that are highly relevant to the query intention and expressive of the pages' entire content. We show that the usage of semantics, as a basis for focused retrieval, ...

In this paper, we experimentally study the problem of querying the web in a hybrid language, namely Greeklish. Greeklish is the transliteration of Greek into Latin characters of the ASCII code. Although Greeklish emerged as a convenient means for the creation and distribution of digital data at a time when Unicode Transformation Format was not supported for the Greek alphabet, it is still being used as a matter of habit or need. Today, a considerable amount of the Greek web data consists of pages written in Greeklish. Although these are less official web pages that appear mainly in blogs or forums, their contents may be of good quality and useful to Greek online information seekers. However, the paradox of searching the Greek web is that search engines perceive Greeklish as a totally different language from Greek and as such they do not return Greek pages in response to Greeklish queries. As a consequence, users who issue Greeklish queries (sometimes for tech...
Keywords Identification within Greek URLs

Jdim, 2009
One of the most important measures for estimating the impact of scientific publications is the number of citations they have received. Today, there exist several tools and metrics to evaluate the relative importance of individual papers and publication venues based on their citation distribution. Despite their acknowledged usefulness, most of the existing techniques rely on quantitative rather than qualitative aspects of the citation analysis and thus they are inherently limited in conveying any specific information about the authors' opinions of the papers they cite. In this paper, we introduce a method that combines text mining and lexical analysis in order to elucidate the authors' attitude towards the works they cite in their publications. We have applied our method to a set of 4,520 citations that span 40 publications and tried to shed some light on the following issues. How often do authors express an opinion about the papers they cite? Do all authors who cite the same publication share a common understanding of that publication's impact? Does the citations' context influence people's perception of the referred papers, and how can we take context into consideration? Can we define a qualitative measure for estimating the impact of scientific publications? Our evaluation shows that although authors do not always express their personal opinions about the papers they cite, their judgments (when articulated) have a great influence on the papers' perceived importance.

Towards Faceted Search for Named Entity Queries
Lecture Notes in Computer Science, 2009
A considerable fraction of web queries contain named entities. This, coupled with the fact that a proper name might refer to multiple entities, creates an ever-increasing need for search engines to handle named entity queries efficiently. In this paper, we present a technique that automatically identifies the distinct subject classes to which a named entity query might refer and selects a set of appropriate facets for denoting the query properties within every class. We also suggest a method that examines the distribution of the identified query facets within the contents of the query matching pages and groups search results according to their entity denotation types. Our preliminary study shows that our technique identifies useful facets for representing the named entity query properties in each of their referenced subject classes.
Lecture Notes in Computer Science, 2011
In recent decades the explosion of ICT has opened up new avenues for people's access to new job opportunities. Current technological advances, in conjunction with people's online presence, provide a great opportunity to automate the recruitment process and make it more effective. In this paper, we propose a novel approach for improving the efficiency of e-recruitment systems. Our approach relies on the linguistic analysis of data available for job applicants, in order to infer the applicants' personality traits and rank them accordingly. To showcase the functionality of our method, we employed it in a web-based e-recruitment system that we implemented.

2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, 2007
The explosive growth of online data and the diversity of goals that may be pursued over the web have significantly increased the monetary value of the web traffic. To tap into this accelerating market, web site operators try to increase their traffic by customizing their sites to the needs of specific users. Web site customization involves two great challenges: the effective identification of the user interests and the encapsulation of those interests into the sites' presentation and content. In this paper, we study how we can effectively detect the user interests that are hidden behind navigational patterns and we introduce a novel recommendation mechanism that employs web mining techniques for correlating the identified interests to the sites' semantic content, in order to customize them to specific users. Our experimental evaluation shows that the user interests can be accurately detected from their navigational behavior and that our recommendation mechanism, which uses the identified interests, yields significant improvements in the sites' usability.
Ontology-Based Adaptive Query Refinement

Proceeding of the eleventh international workshop on Web information and data management - WIDM '09, 2009
In this paper, we experimentally study how web searchers select the keywords to describe their information needs and specifically we investigate whether query keyword selections are influenced by the results the users reviewed for a previous search. For our study, we determine two types of searches: (i) those in which users define their queries without any external influence and which we call tightly-focused and (ii) those in which users define their queries under some external influence and which we call loosely-focused. Based on the analysis of the user querying trends and web visits on the query results, we propose a model that tries to capture the results' influence on the specification of the subsequent user queries. Applying our model to a search trace of 19,250 queries issued to Google by 18 users over a period of two months reveals that, overall, search results influence the specification of 12.79% of the web queries.

Evaluating the correspondence of educational software to learning theories
Proceedings of the 17th Panhellenic Conference on Informatics - PCI '13, 2013
As new technologies emerge, more and more people depend on them for a variety of purposes. Now more than ever there is a tendency for technological implications to substitute for face-to-face communication and education. In this paper we investigate whether the usability and pedagogical factors for quality educational platforms meet the expectations of three dominant learning theories of the past century, namely behaviorism, cognitivism and constructivism. We assign specific factors derived from 9 evaluation models to the 3 learning theories. A list of 15 questions was produced to help evaluators in the assessment of the educational software. We then evaluated 11 educational websites that aim to help anglophone students improve their language skills, e.g. through grammar and spelling exercises. The results show the level of correspondence of these educational websites to the learning theories.
Lecture Notes in Computer Science, 2008
In this paper we present a novel approach for building a focused crawler. The goal of our crawler is to effectively identify web pages that relate to a set of predefined topics and download them regardless of their web topology or connectivity with other popular pages on the web. Our study addresses two main challenges. First, we need to be able to effectively identify the pages' topical content before these are fully downloaded and processed. Second, we need to obtain a well-balanced set of training examples that the crawler will regularly consult in its subsequent web visits.

Proceeding of the 2nd ACM workshop on Improving non english web searching - iNEWS '08, 2008
As the Web becomes an integral part of our everyday life and the Internet-literate population grows rapidly, the search engine market is steadily gaining a high monetary value. Unfortunately, today, the search market share is dominated by English-speaking users and stakeholders, basically because English is the lingua franca of the Web. Thus, although the majority of Web users are not native English speakers, they naturally gravitate to using English in order to explore the plentiful Web content. In this paper, we propose a query selection mechanism for helping users perform successful non-English Web searches. Our mechanism combines linguistic analysis and Web mining techniques and aims at helping users select informative and well-specified queries for expressing their information needs in languages other than English. Our technique is validated on a dataset of 70 Greek queries issued to the Google search engine over a period of 3 weeks. The results demonstrate that our query selection mechanism yields improved retrieval performance compared to existing non-English search strategies and as such we believe that it can be fruitfully deployed for other natural languages.

The explosive growth of online data and the diversity of goals that may be pursued over the web have significantly increased the monetary value of the web traffic. To tap into this accelerating market, web site operators try to increase their traffic by customizing their sites to the needs of specific users. Web site customization involves three great challenges: (i) the accurate identification of the user interests in the sites' content, (ii) the detection of the user goals in their site visits and (iii) the encapsulation of the user interests and goals into the sites' presentation and content. In this paper, we study how we can effectively identify the user interests and goals in their site visits and we evaluate their correlation as this is exemplified in the users' navigational patterns and the site's semantic content and structure. Based on our findings we propose a novel recommendation mechanism that employs web mining techniques for suggesting customized site views that satisfy both the user preferences and goals. Our experimental evaluation shows that the user site interests and interaction goals can be accurately detected from the users' navigational behavior and that our recommendation model, which uses the identified user preferences and goals, yields significant improvements in the sites' usability.