Papers by Rafael Pérez Muñoz
Lecture Notes in Computer Science, 2005
This paper presents the evaluation of a Temporal QA system for handling complex temporal questions. The system was implemented as a multilayered architecture in which complex temporal questions are first decomposed into simple questions, according to the temporal relations expressed in the original question. These simple questions are then processed independently by our standard Question Answering engine, and their answers are filtered to satisfy the temporal restrictions of each simple question. Finally, the answers to the simple questions are integrated, following the temporal relations extracted from the original complex question, to compose the final answer. The evaluation was performed as a pilot task at the Spanish QA Track of the Cross Language Evaluation Forum 2004.

Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics - ACL '04, 2004
This paper presents a multi-layered Question Answering (Q.A.) architecture that extends current Q.A. capabilities to complex questions, that is, questions whose answer needs to be gathered from pieces of factual information scattered across different documents. Specifically, we have designed a layer that processes the different types of temporal questions. Complex temporal questions are first decomposed into simpler ones, according to the temporal relationships expressed in the original question. In the same way, the answers to each simple question are recomposed so that they fulfil the temporal restrictions of the original complex question. A Temporal Q.A. system has been developed using this architecture. In this paper, we focus on the first part of the process: the decomposition of complex questions. The system has been evaluated with the TERQAS question corpus of 112 temporal questions. On the question-splitting task it achieved a precision of 85% and a recall of 71%.
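The decomposition step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signal list, the function name `split_question` and the example question are all assumptions made for illustration, and the sketch simply splits a question at the first temporal signal word it finds.

```python
import re

# Hypothetical list of temporal signals; the real system's inventory is not given here.
TEMPORAL_SIGNALS = ("before", "after", "while", "during", "since")


def split_question(question):
    """Split a complex temporal question at its first temporal signal.

    Returns (main_question, signal, temporal_clause). If no signal is
    found, the question is treated as simple and returned unchanged.
    """
    pattern = r"\b(" + "|".join(TEMPORAL_SIGNALS) + r")\b"
    match = re.search(pattern, question, flags=re.IGNORECASE)
    if match is None:
        return (question.strip(), None, None)
    # Text before the signal becomes the main (simple) question.
    main = question[:match.start()].strip().rstrip(",") + "?"
    # Text after the signal is the clause that carries the temporal restriction.
    clause = question[match.end():].strip().rstrip("?").strip()
    return (main, match.group(1).lower(), clause)


result = split_question("Which films were released after the war ended?")
print(result)
```

In a full pipeline of the kind the abstract describes, the temporal clause would presumably be turned into its own question whose answer (a date or period) is then used to filter the answers to the main question.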

RIV): Probabilistic Relevance Feedback (PRF) and Local Context Analysis (LCA). The main difference observed between the two methods is that, whereas PRF expands the query with the annotations of the top-ranked images, LCA uses a co-occurrence-based heuristic to avoid annotations that belong to non-relevant images appearing in those top positions. The results show that LCA achieves better precision than PRF as the precision of the ranking used for expansion decreases. This observation makes LCA especially suitable for low-precision rankings such as those returned by content-based RIV systems. This is borne out by the good results obtained with the multimodal variant of LCA, which is the only local expansion strategy that does not harm the diversity of the results and is also the one that gives our best precision results on the query set of the ImageCLEFPhoto 2008 task (4th in MAP and 5th in P20 among the 1039 automatic runs submitted by the participants).

Lecture Notes in Computer Science, 2010
In our participation in the ImageCLEF 2009 Photo Retrieval task we pursued two objectives: first, to re-evaluate MultiModal Local Context Analysis (MMLCA), our multimodal fusion technique; second, to evaluate a new subquery generation technique based on clustering. The experiments confirmed that MMLCA performs better on generic-domain collections than the other local expansion techniques evaluated, and our clustering-based subquery generation obtained good results (5th best textual run of the Photo Retrieval task). Beyond these results, this paper reflects on the extent to which reordering techniques are appropriate for promoting diversity. The results suggest that reordering strategies have a limited margin of improvement because they work with a limited number of documents, whereas approaches based on subquery generation can overcome this limitation.

Lecture Notes in Computer Science, 2009
This paper describes our participation in the Robust WSD Task at CLEF 2008. The aim of this pilot task is to explore methods that can take advantage of WSD information to improve IR systems. Our approach combines a passage-based system with a WordNet-based expansion method for both the collection documents and the queries, using the two WSD system runs provided by the organization. We have also experimented with two well-known relevance feedback methods (LCA and PRF) in order to determine which is better suited to exploiting the WordNet-based WSD query expansion. Our best run obtained 4th place in the competition with a MAP of 0.4008. We conclude that LCA fits this task better than PRF, and that our WSD expansion is useful for some query subsets. In future work we will study the features of the query subsets for which the performance of our system decreases.

Proceedings of the Workshop on Annotating and Reasoning about Time and Events - ARTE '06, 2006
The extension to new languages is a well-known bottleneck for rule-based systems. Considerable human effort, which typically consists of rewriting huge numbers of rules from scratch, is required to transfer the knowledge available to the system from one language to another. Given sufficient annotated data, machine learning algorithms make it possible to minimize the cost of such knowledge transfer but have, to date, proved ineffective for some specific tasks. Among these, the recognition and normalization of temporal expressions remains out of their reach. Focusing on this task, and still adhering to the rule-based framework, this paper presents a set of experiments on the automatic porting to Italian of a system originally developed for Spanish. Different automatic rule translation strategies are evaluated and discussed, providing a comprehensive overview of the challenge.

This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions that was originally developed for Spanish. TERSEO was previously extended to English and Italian through the automatic translation of temporal expressions (see Saquete et al. (2004a)), but a new methodology has been designed with the aim of obtaining better results. This methodology uses parallel corpora to extend the TERSEO temporal model to other languages. Two different methods have been tested: (1) automatic translation of TERSEO patterns to other languages and (2) automatic corpus annotation on the target side of parallel corpora. The main idea is to annotate the Spanish side of a parallel corpus, project the analysis onto the second language, and then obtain new TERSEO patterns (1) and a new annotated corpus (2). The new patterns will be used to improve the current TERSEO language-independent modules, whereas the new annotated corpus will be used to train a ML system that will annotate new temporal expressions in the new language.

In this paper we have focused on comparing the behaviour of two relevance feedback methods in this task (LCA and PRF) and on checking whether our passage-based information retrieval (IR) system is useful in a competition with small documents. Furthermore, we have added an adaptation to this domain that decompounds into single terms those file names which use Camel Case notation. We base this decision on the belief that the most meaningful information a human assigns to an image file is in the file name itself. It is therefore important to make these terms visible when they are hidden in a compound file name. Finally, we have added a geographical query expansion and a visual concept expansion. We obtained 29th place out of a total of 77 runs with our baseline run, which used only the passage IR system, and 3rd place with our best run, which used the passage IR system with Camel Case decompounding. It shows us on one hand the usefulness of our p...
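The Camel Case decompounding mentioned in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function name `split_camel_case`, the regular expression and the sample file names are assumptions, as is the decision to drop the file extension and lowercase the resulting terms.

```python
import re


def split_camel_case(file_name):
    """Decompound a Camel Case file name into lowercase index terms."""
    # Drop the extension, if any (assumption: it carries no useful terms).
    stem = file_name.rsplit(".", 1)[0] if "." in file_name else file_name
    # Alternatives, in order: an acronym followed by a capitalised word,
    # an optionally capitalised word, a trailing acronym, a run of digits.
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", stem)
    return [part.lower() for part in parts]


print(split_camel_case("GoldenGateBridge.jpg"))  # ['golden', 'gate', 'bridge']
print(split_camel_case("HTTPServerLog"))         # ['http', 'server', 'log']
```

Terms produced this way could then be indexed alongside the other text attached to the image, which is what makes them visible to a text retrieval system.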

Data & Knowledge Engineering, 2006
In this paper, a method of event ordering based on temporal information resolution is presented. The method consists of two main steps. The first is the recognition and resolution of the temporal expressions that can be transformed into a date; these dates establish an order between the events that contain them. The second is the detection of temporal signals, for example "after", that cannot be transformed into a concrete date but relate two events chronologically. This event ordering method can be applied to Natural Language Processing systems such as Summarization and Question Answering. It is important to emphasize that the event ordering method is based on a multilingual temporal information resolution system, and that this multilinguality has been obtained automatically from a monolingual (Spanish) system. The evaluation of the multilingual system is also presented, achieving a precision of 88% for Spanish and 77% for English.
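The two steps in this abstract can be pictured with a small sketch under strong simplifying assumptions. It is not the method from the paper: the function `order_events`, the event names and the rule of placing a signal-linked event immediately next to its anchor event are all hypothetical choices made for illustration.

```python
from datetime import date


def order_events(dated, signals):
    """Order events using resolved dates first, then temporal signals.

    dated:   dict mapping event name -> resolved date
    signals: list of (event, signal, anchor) triples, where signal is
             "before" or "after" and anchor is an already placed event
    """
    # Step 1: events whose temporal expression resolved to a date are sorted.
    ordered = sorted(dated, key=dated.get)
    # Step 2: events linked only by a signal are placed next to their anchor
    # (a simplification; a real system would need to handle conflicts).
    for event, signal, anchor in signals:
        position = ordered.index(anchor)
        ordered.insert(position + 1 if signal == "after" else position, event)
    return ordered


timeline = order_events(
    {"election": date(2004, 3, 14), "summit": date(2003, 6, 1)},
    [("resignation", "after", "election")],
)
print(timeline)  # ['summit', 'election', 'resignation']
```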
Lecture Notes in Computer Science, 2006
This paper describes the participation of the IR-n system at CLEF-2005. This year we participated in the bilingual task (English-French and English-Portuguese) and the multilingual task (English, French, Italian, German, Dutch, Finnish and Swedish). At this conference we introduced the combined passages method for the bilingual task. Furthermore, we applied the method of logic forms in the same task. For the multilingual task, the University of Alicante and the University of Jaen participated jointly. We want to emphasize the good score achieved in the bilingual task, an improvement of 45% over the average.
Project for the Improvement of Access to Camino del Cementerio, in Antequera (Málaga). REPORT. 13. EXECUTION OF THE WORKS. The project's supervising management (Dirección Facultativa) may decide that the works be carried out at weekends or at night, and the contractor may not use this as grounds for any kind of financial claim. Given the execution period planned for the works, as specified in section 8 of this report, a price revision is not considered necessary.

Journal of Artificial Intelligence Research, 2009
This paper presents a multilayered architecture that enhances the capabilities of current QA systems and allows different types of complex questions or queries to be processed. The answers to these questions need to be gathered from factual information scattered throughout different documents. Specifically, we designed a specialized layer to process the different types of temporal questions. Complex temporal questions are first decomposed into simple questions, according to the temporal relations expressed in the original question. In the same way, the answers to the resulting simple questions are recomposed, fulfilling the temporal restrictions of the original complex question. A novel aspect of this approach is that the decomposition uses a minimal quantity of resources, with the final aim of obtaining a portable platform that is easily extensible to other languages. In this paper we also present a methodology for evaluation of the decomposition of the questions as well a...

Expert Systems with Applications, 2015
The Web 2.0 has changed how users consume and interact with information, and has introduced a wide range of new textual genres, such as reviews and microblogs, through which users communicate, exchange, and share opinions. Exploiting all this user-generated content is of great value to both users and companies, since it can assist them in their decision-making processes. In this context, automatic methods that help manage online information more quickly are needed. This article therefore proposes and evaluates a novel concept-level approach to ultra-concise abstractive opinion summarization. Our approach integrates syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, and is thus able to generate abstractive summaries, which is one of the most challenging issues for this task. To allow different settings of the approach to be analyzed, the sentence regeneration module was made optional, leading to two versions of the system (one with sentence regeneration and one without). They were tested on a corpus of 400 English texts gathered from reviews and tweets belonging to two different domains. Although both versions proved to be reliable methods for generating this type of summary, the results indicate that the version without sentence regeneration yielded better results, improving on a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.

Lecture Notes in Computer Science, 2007
The Answer Validation Exercise (AVE) is a pilot track within the Cross-Language Evaluation Forum (CLEF) 2006. The AVE competition provides an evaluation framework for answer validation in Question Answering (QA). In our participation in AVE, we propose a system that was initially used for another task, Recognising Textual Entailment (RTE). The aim of our participation is to evaluate the improvement our system brings to QA. Because the two tasks (AVE and RTE) share the same main idea, which is to find semantic implications between two fragments of text, our system could be applied directly to the AVE competition. The system is based on representing the texts as logic forms and computing a semantic comparison between them. This comparison is carried out using two different approaches: the first relies on a deeper study of the WordNet relations, and the second uses the measure defined by Lin to compute the semantic similarity between the logic form predicates. We have also designed a voting strategy between our system and the MLEnt system, also presented by the University of Alicante, with the aim of obtaining a joint execution of the two systems developed at the University of Alicante. Although the results obtained were not very high, we consider them quite promising, and they indicate that much research remains to be done on all kinds of textual entailment.
Abstract: In this paper we present our contribution to Task 1 (polarity classification into 6 levels) of the TASS 2013 competition. The contribution consists of two different approaches: a modified version of a ranking algorithm (RA-SR) using bigrams, and a new proposal that uses a skipgram scorer. These approaches build sentiment dictionaries that preserve the context of the terms. All our approaches rank among the top 10 results of the systems submitted to the competition, and the combination of both reaches first place. Keywords: sentiment analysis, opinion mining, lexicon generation, machine learning, Twitter, ranking algorithm, skipgrams
Medicina y Seguridad del Trabajo, 2013
It is essential to note that one of the most significant drawbacks encountered was the lack of coding of this diagnosis into its different diagnostic subtypes. This may lead to a shortage of key information for the correct diagnostic classification of patients and their subsequent treatment, affecting the course and duration of Temporary Disability and absenteeism from work.
Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing - RTE '07, 2007
The textual entailment recognition system that we discuss in this paper takes a perspective-based approach composed of two modules that analyze text-hypothesis pairs from strictly lexical and strictly syntactic perspectives, respectively. We attempt to show that the textual entailment recognition task can be addressed by performing individual analyses that extract the maximum amount of information each single perspective can provide. We compare this approach with the system we presented in the previous edition of the PASCAL Recognising Textual Entailment Challenge, obtaining an accuracy rate 17.98% higher.

Lecture Notes in Computer Science
This paper discusses the recognition of textual entailment in a text-hypothesis pair by applying a wide variety of lexical measures. We consider that the entailment phenomenon can be tackled at three general levels: lexical, syntactic and semantic. The main goals of this research are to deal with this phenomenon from a lexical point of view and to achieve high results using only this kind of knowledge. To accomplish this, the information provided by the lexical measures is used as a set of features for a Support Vector Machine, which decides whether the entailment relation holds. A study of the most relevant features and a comparison with the best state-of-the-art textual entailment systems are presented throughout the paper. Finally, the system has been evaluated using the Second PASCAL Recognising Textual Entailment Challenge data and evaluation methodology, obtaining an accuracy rate of 61.88%.
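As an illustration of how lexical measures can become a feature vector for a classifier, here is a minimal sketch. The three measures chosen (hypothesis word overlap, Jaccard similarity and a longest-common-subsequence ratio), the function names and the example pair are assumptions for illustration; the abstract does not list the measures the authors actually used.

```python
import re


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lexical_features(text, hypothesis):
    """Return [overlap, jaccard, lcs_ratio] for a text-hypothesis pair."""
    t, h = tokenize(text), tokenize(hypothesis)
    t_set, h_set = set(t), set(h)
    shared, union = t_set & h_set, t_set | h_set
    overlap = len(shared) / len(h_set) if h_set else 0.0   # hypothesis words found in text
    jaccard = len(shared) / len(union) if union else 0.0   # shared vocabulary ratio
    lcs_ratio = lcs_length(t, h) / len(h) if h else 0.0    # word-order-aware match
    return [overlap, jaccard, lcs_ratio]


features = lexical_features("The cat sat on the mat", "The cat sat")
print(features)  # [1.0, 0.6, 1.0]
```

Vectors like this, computed for each labelled pair, are the kind of input a Support Vector Machine implementation (for example scikit-learn's `SVC`) could be trained on to make the entailment decision.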

Diálogos, 2013
was one of the most prominent Colombian thinkers of the 20th century. Although last year marked the 150th anniversary of his birth, his work remains scattered and many of his intellectual facets are still unknown. This essay seeks to build the intellectual profile of the Antioquian, focusing on the humanist angle in the 20th century. Drawing on unusual sources, it traces three registers of the Colombian critic's opinion and thought: (A) his journalistic convictions and his work in the press; (B) the analysis of modernity; and (C) his inquiries into the impact of the process of massification. It likewise explores one of the least-known fields, that of sociological and political analysis. The essay offers reflections on the political context and historical framework in which Sanín Cano's creative output is situated, specifically his writings in the Buenos Aires daily La Nación and the Bogotá newspaper El Tiempo.
Crisis, dictaduras, democracia: I …, 2008
When one examines the process of transition to democracy that began after the death of General Franco, especially through the popular accounts published by the media, one finds some ideas that have consolidated a ...