Papers by Asli Celikyilmaz
We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects of an utterance's target domain (e.g., movies) and intention (e.g., finding a movie), along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. Using utterances from five domains, our approach shows up to 4.5% improvement on domain and dialog act performance over a cascaded approach, in which each semantic component is learned sequentially, and over a supervised joint learning model (which requires fully labeled data).

We present a graph-based semi-supervised learning approach for the question-answering (QA) task of ranking candidate sentences. Using textual entailment analysis, we obtain entailment scores between a natural language question posed by the user and the candidate sentences returned from a search engine. The textual entailment between two sentences is assessed via features representing high-level attributes of the entailment problem such as sentence structure matching, question-type named-entity matching based on a question-classifier, etc. We implement a semi-supervised learning (SSL) approach to demonstrate that utilization of more unlabeled data points can improve the answer-ranking task of QA. We create a graph for labeled and unlabeled data using match-scores of textual entailment features as similarity weights between data points. We apply a summarization method on the graph to make the computations feasible on large datasets. With a new representation of graph-based SSL on QA datasets using only a handful of features, and under limited amounts of labeled data, we show improvement in generalization performance over state-of-the-art QA models.
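
The graph construction described in this abstract can be illustrated with a generic label propagation routine. This is a minimal sketch, not the authors' method: the function name `propagate_labels`, the toy similarity matrix, and the update rule (repeated weighted averaging with labeled nodes clamped) are all assumptions, and the graph summarization step is not shown.

```python
def propagate_labels(weights, labels, iterations=50):
    """Generic iterative label propagation on a weighted graph.

    weights: symmetric n x n list of lists of non-negative similarities
             (hypothetical stand-ins for entailment match scores).
    labels:  list of length n; 1.0 or 0.0 for labeled nodes, None for unlabeled.
    Returns one score in [0, 1] per node.
    """
    n = len(labels)
    # Unlabeled nodes start at a neutral 0.5; labeled nodes keep their label.
    scores = [0.5 if y is None else float(y) for y in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(float(labels[i]))  # clamp labeled nodes
                continue
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total == 0:
                updated.append(scores[i])  # isolated node: nothing to average
                continue
            weighted = sum(weights[i][j] * scores[j] for j in range(n) if j != i)
            updated.append(weighted / total)
        scores = updated
    return scores


# Toy graph (assumed values): node 0 is a labeled correct answer, node 1 a
# labeled incorrect answer, nodes 2 and 3 are unlabeled candidates.
toy_weights = [
    [0.0, 0.1, 0.9, 0.2],
    [0.1, 0.0, 0.1, 0.8],
    [0.9, 0.1, 0.0, 0.1],
    [0.2, 0.8, 0.1, 0.0],
]
toy_scores = propagate_labels(toy_weights, [1.0, 0.0, None, None])
```

In this toy graph, node 2 is most similar to the correct answer and node 3 to the incorrect one, so node 2 ends with the higher score and would rank first among the unlabeled candidates.
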
There have been considerable attempts to incorporate semantic knowledge into coreference resolution systems: different knowledge sources such as WordNet and Wikipedia have been used to boost the performance. In this paper, we propose new ways to extract a WordNet feature. This feature, along with other features such as a named entity feature, can be used to build an accurate semantic class (SC) classifier. In addition, we analyze the SC classification errors and propose to use relaxed SC agreement features. On the ACE2 coreference evaluation, the proposed accurate SC classifier and the relaxation of SC agreement features boost our baseline system by 10.4% in MUC score and 9.7% in anaphor accuracy.

Spoken language understanding (SLU) is one of the main tasks of a dialog system, aiming to identify semantic components in user utterances. In this paper, we investigate the incorporation of context into the SLU tasks of intent prediction and slot detection. Using a corpus that contains session-level information, including the start and end of a session and the sequence of utterances within it, we experiment with the incorporation of information from previous intra-session utterances into the SLU tasks on a given utterance. For slot detection, we find that including features indicating the slots appearing in the previous utterances gives no significant increase in performance. In contrast, for intent prediction we find that a similar approach that incorporates the intent of the previous utterance as a feature yields relative error rate reductions of 6.7% on transcribed data and 8.7% on automatically-recognized data. We also find similar gains when treating intent prediction of utterance sequences as a sequential tagging problem via SVM-HMMs.
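
The idea of using the previous utterance's intent as a feature, and the relative error rate reduction metric, can be sketched as follows. This is an illustrative sketch only: the function names, the feature naming scheme, the `<session_start>` token, and the sample session are assumptions rather than details from the paper, and the sketch uses the reference intent of the previous utterance, whereas a deployed system would have to use a predicted one.

```python
def add_previous_intent_feature(session, start_token="<session_start>"):
    """Build (features, intent) pairs for one session.

    session: ordered list of dicts with 'words' (list of str) and 'intent' (str).
    Each utterance gets bag-of-words features plus one feature naming the
    intent of the utterance that preceded it in the same session.
    """
    examples = []
    previous = start_token
    for utterance in session:
        features = {f"word={w.lower()}": 1 for w in utterance["words"]}
        features[f"prev_intent={previous}"] = 1
        examples.append((features, utterance["intent"]))
        previous = utterance["intent"]
    return examples


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error rate removed by the new system."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical two-utterance session, for illustration only.
toy_session = [
    {"words": ["find", "comedies"], "intent": "find_movie"},
    {"words": ["who", "directed", "it"], "intent": "find_director"},
]
toy_examples = add_previous_intent_feature(toy_session)
```

With made-up error rates of 0.20 for a baseline and 0.18 for a context-aware model, `relative_error_reduction(0.20, 0.18)` gives roughly 0.10, that is, a 10% relative reduction; the figures reported in the abstract are computed in this relative sense.
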

We present a generative model for conversational dialogues, namely the actor-topic model (ACTM), that extends the author-topic model (Rosen-Zvi et al., 2004) to identify the actors of a given conversation in literary narratives. Thus ACTM assigns each instance of quoted speech to an appropriate character. We model dialogues in a literary text, which take place between two or more actors conversing on different topics, as distributions over topics, which are also mixtures of the term distributions associated with multiple actors. This follows the linguistic intuition that rich contextual information can be useful in understanding dialogues, eventually affecting the social network construction. We propose ACTM to ideally lead our research on social network extraction in literary narratives. Our experiments on nineteenth century English novels indicate that exploiting the content structure of dialogues can yield significant improvements over a baseline using language models, which is based on local context, in constructing social interactions.

Finding concepts in natural language utterances is a challenging task, especially given the scarcity of labeled data for learning semantic ambiguity. Furthermore, data mismatch issues, which arise when the expected test (target) data does not exactly match the training data, aggravate this scarcity problem. To deal with these issues, we describe an efficient semi-supervised learning (SSL) approach which has two components: (i) Markov Topic Regression is a new probabilistic model to cluster words into semantic tags (concepts). It can efficiently handle semantic ambiguity by extending standard topic models with two new features. First, it encodes word n-gram features from labeled source and unlabeled target data. Second, by going beyond a bag-of-words approach, it takes into account the inherent sequential nature of utterances to learn semantic classes based on context. (ii) Retrospective Learner is a new learning technique that adapts to the unlabeled target data. Our new SSL approach improves semantic tagging performance by 3% absolute over the baseline models, and also compares favorably on semi-supervised syntactic tagging.

State-of-the-art spoken language understanding models that automatically capture user intents in human-to-machine dialogs are trained with manually annotated data, which is cumbersome and time-consuming to prepare. For bootstrapping the learning algorithm that detects relations in natural language queries to a conversational system, one can rely on publicly available knowledge graphs, such as Freebase, and mine corresponding data from the web. In this paper, we present an unsupervised approach to discover new user intents using a novel Bayesian hierarchical graphical model. Our model employs search query click logs to enrich the information extracted from bootstrapped models. We use the clicked URLs as implicit supervision and extend the knowledge graph based on the relational information discovered from this model. The posteriors from the graphical model relate the newly discovered intents with the search queries. These queries are then used as additional training examples to complement the bootstrapped relation detection models. The experimental results demonstrate the effectiveness of this approach, showing extended coverage to new intents without impacting the known intents.

Language can describe our visual world at many levels, including not only what is literally there but also the sentiment that it invokes. In this paper, we study visual language, both literal and sentimental, that describes the overall appearance and style of virtual characters. Sentimental properties, including labels such as "youthful" or "country western," must be inferred from descriptions of the more literal properties, such as facial features and clothing selection. We present a new dataset, collected to describe Xbox avatars, as well as models for learning the relationships between these avatars and their literal and sentimental descriptions. In a series of experiments, we demonstrate that such learned models can be used for a range of tasks, including predicting sentimental words and using them to rank and build avatars. Together, these results demonstrate that sentimental language provides a concise (though noisy) means of specifying low-level visual properties.
Semi Supervised Semantic Tagging for Conversational Understanding using Markov Topic Regression
We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects of an utterance's target domain (e.g., movies) and intention (e.g., finding a movie), along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model.
There have been considerable attempts to incorporate semantic knowledge into coreference resolution systems: different knowledge sources such as WordNet and Wikipedia have been used to boost the performance. In this paper, we propose new ways to extract a WordNet feature. This feature, along with other features such as a named entity feature, can be used to build an accurate semantic class (SC) classifier. In addition, we analyze the SC classification errors and propose to use relaxed SC agreement features.
Graph-based semi-supervised learning has recently emerged as a promising approach to data-sparse learning problems in natural language processing. These methods rely on graphs that jointly represent each data point. The problem of how to best formulate the graph representation remains an open research topic. In this paper, we introduce a type-2 fuzzy arithmetic to characterize the edge weights of a formed graph as type-2 fuzzy numbers.
In this paper, we present a novel approach to exploit user queries mined from search engine query click logs to bootstrap or improve slot filling models for spoken language understanding. We propose extending the earlier gazetteer population techniques to mine unannotated training data for semantic parsing. The automatically annotated mined data can then be used to train slot-specific parsing models.
Fuzzy inference systems based on fuzzy rule bases (FRBs) have been successfully used to model real problems. One limitation of these traditional fuzzy inference systems is the abundance of fuzzy operations and operators that an expert must identify. In this paper we present an alternate learning and reasoning schema, which uses fuzzy functions instead of if…then rule base structures.
This paper presents a type-2 genetic fuzzy inference system based on the fuzzy c-regression method clustering algorithm, to identify uncertainties in hyperplane-shaped fuzzy clusters. The uncertainty in the learning parameters of the new system is identified by type-2 fuzzy sets. A genetic algorithm is used to optimize the secondary membership grades of the type-2 fuzzy sets.
“Fuzzy Functions” are proposed to be determined separately by two regression estimation techniques, least squares estimation (LSE) and Support Vector Machines for Regression (SVR), for the development of fuzzy system models. The LSE model tries to estimate the fuzzy function parameters linearly in the original space, whereas the SVR algorithm maps the data samples into a higher dimensional feature space and estimates a linear fuzzy function in the feature space.
The objective of this book is to present an uncertainty modeling approach using a new type of fuzzy system model via "Fuzzy Functions". Since most researchers on fuzzy systems are more familiar with the standard fuzzy rule bases and their inference system structures, many standard tools of fuzzy system modeling approaches are reviewed to demonstrate the novelty of the structurally different fuzzy functions, before we introduce the new methodologies.
Proceedings of the NAACL HLT 2010 Workshop on Semantic Search
Proceedings of the …, 2010
@Book{SEMANTIC:2010,
  editor = {Donghui Feng and Jamie Callan and Eduard Hovy and Marius Pasca},
  title = {Proceedings of the NAACL HLT 2010 Workshop on Semantic Search},
  month = {June},
  year = {2010},
  address = {Los Angeles, California},
  publisher = {Association for Computational Linguistics},
  url = {http://www.aclweb.org/anthology/W10-12}
}
@InProceedings{celikyilmaz-hakkanitur-tur:2010:SEMANTIC,
  author = {Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan},
  title = {LDA Based Similarity Modeling for Question Answering ...

research.microsoft.com
In natural language human-machine statistical dialog systems, semantic interpretation is a key task typically performed following semantic parsing, and aims to extract canonical meaning representations of semantic components. In the literature, manually built rules are usually used for this task, even for implicitly mentioned non-named semantic components (like the genre of a movie or the price range of a restaurant). In this study, we present statistical methods for modeling interpretation, which can also benefit from semantic features extracted from large in-domain knowledge sources. We extract features from user utterances using a semantic parser and additional semantic features from textual sources (online reviews, synopses, etc.) using a novel tree clustering approach, to represent unstructured information that corresponds to implicit semantic components related to targeted slots in the user's utterances. We evaluate our models on a virtual personal assistance system and demonstrate that our interpreter is effective in that it not only improves the utterance interpretation in spoken dialog systems (reducing the interpretation error rate by 36% relative compared to a language model baseline), but also unveils hidden semantic units that are otherwise nearly impossible to extract from purely manual lexical features that are typically used in utterance interpretation.