Papers by Georgeta Bordea
We participated in the NLP Unshared Task in PoliInformatics 2014 with Saffron, a system that provides insight into the Financial Crisis by analysing main topics of discussion and experts associated with these topics. Saffron applies term extraction, topical hierarchy construction, expert finding and expert profiling to provide a set of tools that allow users to investigate what was the financial crisis and who was the financial crisis.
In this paper we present a comparative analysis of two series of conferences in the field of Computational Linguistics, the LREC conference and the ACL conference. Conference proceedings were analysed using Saffron by performing term extraction and topical hierarchy construction with the goal of analysing topic trends and research communities. The system aims to provide insight into a research community and to guide publication and participation strategies, especially of novice researchers.
Traditionally, relevance assessments for expert search have been gathered through self-assessment or based on the opinions of coworkers. We introduce three benchmark datasets for expert search that use conference workshops for relevance assessment. Our datasets cover entire research domains as opposed to single institutions. In addition, they provide a larger number of topic-person associations and allow a more objective and fine-grained evaluation of expertise than existing datasets do. We present and discuss baseline results for a language modelling and a topic-centric approach to expert search. We find that the topic-centric approach achieves the best results on domain-specific datasets.
Extracting general or intermediate level terms is a relevant problem that has not received much attention in the literature. Current approaches for term extraction rely on contrastive corpora to identify domain-specific terms, which makes them better suited for specialised terms that are rarely used outside of the domain. In this work, we propose an alternative measure of domain specificity based on term coherence with an automatically constructed domain model. Although previous systems make use of domain-independent features, their performance varies across domains, while our approach displays a more stable behaviour, with results comparable to, or better than, state-of-the-art methods.
In this paper, we address the problem of extracting technical terms automatically from an unannotated corpus. We introduce a technology term tagger that is based on Liblinear Support Vector Machines and employs linguistic features, including Part of Speech tags and Dependency Structures, in addition to user feedback to identify technology-related terms. Our experiments show the applicability of our approach as witnessed by acceptable results on precision and recall.
Enterprise content analysis and platform configuration for enterprise content management is often carried out by external consultants who are not necessarily domain experts. In this paper, we propose a set of methods for automatic content analysis that allow users to gain a high-level view of the enterprise content. Here, a main concern is the automatic identification of key stakeholders that should ideally be involved in analysis interviews. The proposed approach employs recent advances in term extraction, semantic term grounding, expert profiling and expert finding in an enterprise content management setting. Extracted terms are evaluated using human judges, while term grounding is evaluated using a manually created gold standard for the DBpedia data source.
In this paper we present our approach for expertise mining, addressing several research problems such as expertise topic extraction, expert finding and expert profiling. We propose a hybrid solution inspired by different research fields such as expert finding, competency management, terminology extraction, keyword extraction and concept extraction. We introduce an expertise benchmarking dataset gathered by exploiting information about workshop committee members and we present Saffron, a system designed to give an overview of research areas and experts and to assist users in finding skilled individuals.
… of the 5th International Workshop on …, Jan 1, 2010
The Semantic Web: Research and Applications, Jan 1, 2010
College of Engineering & Informatics Research Day, …
… of the 2010 ACM Symposium on …, Jan 1, 2010