Author profiling consists of predicting an author's characteristics (e.g. age, gender, personality) from her writing. After addressing mainly age and gender identification, as well as personality recognition in Twitter, at PAN@CLEF, in this PAN@FIRE track on Personality Recognition from SOurce COde (PR-SOCO) we addressed the problem of predicting an author's personality traits from her source code. In this paper, we analyse 48 runs submitted by 11 participating teams. Given a set of Java source code files written by students who also completed a personality test, participants had to predict personality traits based on the Big Five model. Results were evaluated with two complementary measures (RMSE and Pearson product-moment correlation), which made it possible to identify whether systems with low error rates may owe their results to random chance. Regardless of the approach, openness to experience is the trait for which participants obtained the best results on both measures.
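To illustrate why RMSE and Pearson correlation are complementary, here is a minimal sketch in plain Python. The trait scores are made-up illustrative numbers, not PR-SOCO data, and the function names `rmse` and `pearson` are hypothetical helpers rather than the track's evaluation scripts.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson(x, y):
    """Pearson product-moment correlation; NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / (spread_x * spread_y)

# Illustrative (invented) trait scores for five authors.
true_scores = [40, 50, 60, 45, 55]
always_mean = [50, 50, 50, 50, 50]   # ignores the input entirely
shifted = [60, 70, 80, 65, 75]       # right ordering, wrong scale

print(rmse(true_scores, always_mean), pearson(true_scores, always_mean))
print(rmse(true_scores, shifted), pearson(true_scores, shifted))
```

In this toy case the constant predictor has the lower error but no defined correlation, while the shifted predictor has a much higher error yet tracks the ordering of the scores perfectly, which is the kind of distinction a single measure would miss.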
Passage retrieval is an important stage of question answering systems. Closed-domain passage retrieval, e.g. biomedical passage retrieval, presents additional challenges such as specialized terminology, more complex and elaborate queries, and a scarcity of available data, among others. However, closed domains also offer some advantages, such as the availability of specialized structured information sources, e.g. ontologies and thesauri, that can be used to improve retrieval performance. This paper presents a novel approach to biomedical passage retrieval that combines different information sources using a similarity matrix fusion strategy based on a convolutional neural network architecture. The method was evaluated on the standard BioASQ dataset, a dataset specialized in biomedical question answering. The results show that the method is an effective strategy for biomedical passage retrieval and is able to outperform other state-of-the-art methods in this domain.
This paper presents a real-valued negative selection algorithm with a sound mathematical foundation that addresses some of the drawbacks of our previous approach [11]. Specifically, it can produce a good estimate of the optimal number of detectors needed to cover the non-self space, and the non-self coverage is maximized through an optimization algorithm with proven convergence properties. The proposed method is a randomized algorithm based on Monte Carlo methods. Experiments are performed to validate the assumptions made while designing the algorithm and to evaluate its performance.
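The abstract does not spell out the algorithm, so the following is only a loose sketch of the general Monte Carlo idea, not the paper's method: it estimates what fraction of the non-self region of the unit hypercube is covered by a set of spherical detectors. The function names, the spherical self/detector model, and all parameters are assumptions made for illustration.

```python
import random

def within(point, centers, radius):
    """True if the point lies within `radius` of any center (Euclidean)."""
    return any(
        sum((p - c) ** 2 for p, c in zip(point, center)) <= radius ** 2
        for center in centers
    )

def estimate_coverage(detectors, detector_radius, self_points, self_radius,
                      n_samples=10000, dim=2, seed=0):
    """Monte Carlo estimate of the covered fraction of the non-self space.

    Assumption: the space is the unit hypercube, and self is the union of
    balls of radius `self_radius` around `self_points`.
    """
    rng = random.Random(seed)
    non_self = 0
    hits = 0
    for _ in range(n_samples):
        point = [rng.random() for _ in range(dim)]
        if within(point, self_points, self_radius):
            continue  # sample fell in the self region; not counted
        non_self += 1
        if within(point, detectors, detector_radius):
            hits += 1
    return hits / non_self if non_self else 0.0

# One detector of radius 0.25 in the middle of an otherwise empty unit square.
print(estimate_coverage([(0.5, 0.5)], 0.25, self_points=[], self_radius=0.0))
```

With no self samples, the estimate for this single detector should be close to the area of the disc, about 0.196; the sampling error shrinks as `n_samples` grows.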
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012
In this paper, we propose a method to build an index for image search using multimodal information, that is, using visual features and text data simultaneously. The method combines both data sources and generates a single multimodal representation using latent factor analysis and matrix factorization. One notable characteristic of this multimodal representation is that it connects textual and visual content, which makes it possible to answer queries containing only visual content by implicitly completing the missing textual content. Another important characteristic of the method is that the multimodal representation is learned online using an efficient stochastic gradient descent formulation. Experiments were conducted on a dataset of 5,000 images to evaluate convergence speed and search performance. Experimental results show that the proposed algorithm requires only one pass through the dataset to achieve high-quality retrieval performance.
The negative selection algorithm is one of the most widely used techniques in the field of artificial immune systems. It is primarily used to detect changes in data/behavior patterns by generating detectors in the complementary space (from given normal samples). The negative selection algorithm generally uses binary matching rules to generate detectors. The purpose of the paper is to show that the low-level representation of binary matching rules is unable to capture the structure of some problem spaces. The paper compares some of the binary matching rules reported in the literature and studies how they behave in a simple two-dimensional real-valued space. In particular, we study the detection accuracy and the areas covered by sets of detectors generated using the negative selection algorithm.
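The abstract does not list which matching rules were compared. As an illustration only, the sketch below uses the r-contiguous-bits rule, a rule commonly cited in the negative selection literature, together with the basic generate-and-discard step. The function names and the tiny self set are hypothetical.

```python
import random

def r_contiguous_match(a, b, r):
    """True if bit strings a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0  # extend or reset the agreement run
        if run >= r:
            return True
    return False

def generate_detectors(self_set, n_detectors, length, r, seed=0,
                       max_tries=100000):
    """Negative selection: keep random candidates that match no self string."""
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors

# Invented self set of 4-bit strings; detectors must not match any of them.
self_set = ["0000", "0001"]
print(generate_detectors(self_set, n_detectors=3, length=4, r=3))
```

A detector produced this way flags any new string it matches as non-self; how well such bit-level matching reflects the geometry of a real-valued problem space is the question the paper examines.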
In this paper, we provide an extension of our previous work on adaptive genetic algorithms [1]. Each individual encodes the probability (rate) of each of its genetic operators. In every generation, each individual is modified by only one operator, which is selected according to the encoded rates. The rates are updated according to the performance achieved by the offspring (compared to its parents) and a random learning rate. The proposed approach is augmented with a simple transposition operator and tested on a number of benchmark functions.
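The operator selection and rate update described above can be sketched as follows. The abstract does not give the exact update formula, so the multiplicative reward/penalty followed by renormalization is an assumption, and `select_operator` and `update_rates` are hypothetical names.

```python
import random

def select_operator(rates, rng):
    """Pick one operator index with probability proportional to its rate."""
    return rng.choices(range(len(rates)), weights=rates, k=1)[0]

def update_rates(rates, op_index, improved, rng):
    """Reward or penalize the applied operator, then renormalize.

    Assumption: the rate is scaled by (1 + delta) if the offspring improved
    on its parent and by (1 - delta) otherwise, where delta is a random
    learning rate drawn uniformly from [0, 1).
    """
    delta = rng.random()
    updated = list(rates)
    updated[op_index] *= (1.0 + delta) if improved else (1.0 - delta)
    total = sum(updated)
    return [rate / total for rate in updated]

rng = random.Random(42)
rates = [1 / 3, 1 / 3, 1 / 3]      # e.g. mutation, crossover, transposition
op = select_operator(rates, rng)   # the single operator applied this generation
rates = update_rates(rates, op, improved=True, rng=rng)
print(op, rates)
```

Because the rates travel with the individual, operators that keep producing better offspring for that individual gradually become more likely to be chosen, without a globally tuned operator probability.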
First International Conference on Artificial Immune Systems, Sep 9, 2002
The purpose of this work is to investigate a hybrid approach (neuro-immune technique) for anomaly detection on time series data. In many anomaly detection applications, only positive (normal) samples are available for training purposes. However, conventional classification algorithms need both positive and negative samples. The proposed approach uses normal samples to generate abnormal samples that are subsequently used as training data for a neural network. The approach is compared ...
Papers by Fabio González