Skip to main content

Farhana Liza

Follower

1

Following

2

Public Views

Interests

Uploads

Papers by Farhana Liza

Gender bias in transformers: A comprehensive review of detection and mitigation strategies

Natural Language Processing Journal, Feb 29, 2024

Gender Bias in Transformer Models: A comprehensive survey

arXiv (Cornell University), Jun 18, 2023

Gender bias in artificial intelligence (AI) has emerged as a pressing concern with profound impli... more Gender bias in artificial intelligence (AI) has emerged as a pressing concern with profound implications for individuals' lives. This paper presents a comprehensive survey that explores gender bias in Transformer models from a linguistic perspective. While the existence of gender bias in language models has been acknowledged in previous studies, there remains a lack of consensus on how to effectively measure and evaluate this bias. Our survey critically examines the existing literature on gender bias in Transformers, shedding light on the diverse methodologies and metrics employed to assess bias. Several limitations in current approaches to measuring gender bias in Transformers are identified, encompassing the utilization of incomplete or flawed metrics, inadequate dataset sizes, and a dearth of standardization in evaluation methods. Furthermore, our survey delves into the potential ramifications of gender bias in Transformers for downstream applications, including dialogue systems and machine translation. We underscore the importance of fostering equity and fairness in these systems by emphasizing the need for heightened awareness and accountability in developing and deploying language technologies. This paper serves as a comprehensive overview of gender bias in Transformer models, providing novel insights and offering valuable directions for future research in this critical domain.

Sentence Classification with Imbalanced Data for Health Applications

Identifying and extracting reports of medications, their abuse or adverse effects from social med... more Identifying and extracting reports of medications, their abuse or adverse effects from social media is a challenging task. In social media, relevant reports are very infrequent, causes imbalanced class distribution for machine learning algorithms. Learning algorithms typically designed to optimize the overall accuracy without considering the relative distribution of each class. Thus, imbalanced class distribution is problematic as learning algorithms have low predictive accuracy for the infrequent class. Moreover, social media represents natural linguistic variation in creative language expressions. In this paper, we have used a combination of data balancing and neural language representation techniques to address the challenges. Specifically, we participated the shared tasks 1, 2 (all languages), 4, and 3 (only the span detection, no normalization was attempted) in Social Media Mining for Health applications (SMM4H) 2020 (Klein et al., 2020). The results show that with the proposed...

Multilingual Protest News Detection - Shared Task 1, CASE 2021

Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

Benchmarking state-of-the-art text classification and information extraction systems in multiling... more Benchmarking state-of-the-art text classification and information extraction systems in multilingual, cross-lingual, few-shot, and zeroshot settings for socio-political event information collection is achieved in the scope of the shared task Socio-political and Crisis Events Detection at the workshop CASE @ ACL-IJCNLP 2021. Socio-political event data is utilized for national and international policyand decision-making. Therefore, the reliability and validity of such datasets are of utmost importance. We split the shared task into three parts to address the three aspects of data collection (Task 1), fine-grained semantic classification (Task 2), and evaluation (Task 3). Task 1, which is the focus of this report, is on multilingual protest news detection and comprises four subtasks that are document classification (subtask 1), sentence classification (subtask 2), event sentence coreference identification (subtask 3), and event extraction (subtask 4). All subtasks have English, Portuguese, and Spanish for both training and evaluation data. Data in Hindi language is available only for the evaluation of subtask 1. The majority of the submissions, which are 238 in total, are created using multi-and cross-lingual approaches. Best scores are between 77.27 and 84.55 F1-macro for subtask 1, between 85.32 and 88.61 F1macro for subtask 2, between 84.23 and 93.03 CoNLL 2012 average score for subtask 3, and between 66.20 and 78.11 F1-macro for subtask 4 in all evaluation settings. The performance of the best system for subtask 4 is above 66.20 F1 for all available languages. Although there is still a significant room for improvement in cross-lingual and zero-shot settings, the best submissions for each evaluation scenario yield remarkable results. Monolingual models outperformed the multilingual models in a few evaluation scenarios, in which there is relatively much training data.

Event Causality Identification with Causal News Corpus -- Shared Task 3, CASE 2022

arXiv (Cornell University), Nov 22, 2022

The Event Causality Identification Shared Task of CASE 2022 involved two subtasks working on the ... more The Event Causality Identification Shared Task of CASE 2022 involved two subtasks working on the Causal News Corpus. Subtask 1 required participants to predict if a sentence contains a causal relation or not. This is a supervised binary classification task. Subtask 2 required participants to identify the Cause, Effect and Signal spans per causal sentence. This could be seen as a supervised sequence labeling task. For both subtasks, participants uploaded their predictions for a held-out test set, and ranking was done based on binary F1 and macro F1 scores for Subtask 1 and 2, respectively. This paper summarizes the work of the 17 teams that submitted their results to our competition and 12 system description papers that were received. The best F1 scores achieved for Subtask 1 and 2 were 86.19% and 54.15%, respectively. All the topperforming approaches involved pre-trained language models fine-tuned to the targeted task. We further discuss these approaches and analyze errors across participants' systems in this paper.

Improving Language Modelling with Noise Contrastive Estimation

Proceedings of the AAAI Conference on Artificial Intelligence

Neural language models do not scale well when the vocabulary is large. Noise contrastive estimati... more Neural language models do not scale well when the vocabulary is large. Noise contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, its full potential has not been demonstrated in the language modelling literature. A sufficient investigation of the hyperparameters in the NCE-based neural language models was clearly missing. In this paper, we showed that NCE can be a very successful approach in neural language modelling when the hyperparameters of a neural network are tuned appropriately. We introduced the `search-then-converge' learning rate schedule for NCE and designed a heuristic that specifies how to use this schedule. The impact of the other important hyperparameters, such as the dropout rate and the weight initialisation range, was also demonstrated. Using a popular benchmark, we showed that appropriate tuning of NCE in neural language models ...

The Causal News Corpus: Annotating Causal Relations in Event Sentences from News

arXiv (Cornell University), Apr 25, 2022

Despite the importance of understanding causality, corpora addressing causal relations are limite... more Despite the importance of understanding causality, corpora addressing causal relations are limited. There is a discrepancy between existing annotation guidelines of event causality and conventional causality corpora that focus more on linguistics. Many guidelines restrict themselves to include only explicit relations or clause-based arguments. Therefore, we propose an annotation schema for event causality that addresses these concerns. We annotated 3,559 event sentences from protest event news with labels on whether it contains causal relations or not. Our corpus is known as the Causal News Corpus (CNC). A neural network built upon a state-of-the-art pre-trained language model performed well with 81.20% F1 score on test set, and 83.46% in 5-folds cross-validation. CNC is transferable across two external corpora: CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB). Leveraging each of these external datasets for training, we achieved up to approximately 64% F1 on the CNC test set without additional fine-tuning. CNC also served as an effective training and pre-training dataset for the two external corpora. Lastly, we demonstrate the difficulty of our task to the layman in a crowd-sourced annotation exercise. Our annotated corpus is publicly available, providing a valuable resource for causal text mining researchers.

Improving training of deep neural network sequence models

University of Kent, 2019

Sequence models, in particular, language models are fundamental building blocks of downstream app... more Sequence models, in particular, language models are fundamental building blocks of downstream applications including speech recognition, speech synthesis, information retrieval, machine translation, and question answering systems. Neural network language models are effective in generalising (i.e. perform efficiently with the data sparsity problem) compared to traditional N-grams models. However, neural network language models have several fundamental problems-the training of neural network language models is computationally inefficient and analysing the trained models is difficult. In this thesis, improvement techniques to reduce the computational complexity and an extensive analysis of the learned models are presented. During my PhD study, I have received great support from the information services, school of computing's administrative staff members and IT technical support. I want to thank Angela Doe (ret.), Amanda Ollier, Sonnary Dearden, Julie Teulings and Angie Allen for their great administrative support. I want to thank the Graduate School for providing the skill trainings, those were really useful. I iii iv would like to thank other PhD students, especially Fabio Fabris and Caroline Rizzi Raymundo for their friendly interaction, which made the PhD study less isolated. I would also like to thank Lee Harris for his proofread of my papers. The PhD thesis has improved in quality by the critical reviews from the anonymous reviewers and I would like to thank them all, I also thank SPiCe competition organiser for setting the challenging datasets and organising such a researchoriented competition. I must also thank the members of my supervisory panel: Professor Sally Fincher and Dr. Colin Johnson, who have also contributed to my research with insightful comments. viii

Variation in the timing of Covid-19 communication across universities in the UK

PLOS ONE, 2021

During the Covid-19 pandemic, universities in the UK used social media to raise awareness and pro... more During the Covid-19 pandemic, universities in the UK used social media to raise awareness and provide guidance and advice about the disease to students and staff. We explain why some universities used social media to communicate with stakeholders sooner than others. To do so, we identified the date of the first Covid-19 related tweet posted by each university in the country and used survival models to estimate the effect of university-specific characteristics on the timing of these messages. In order to confirm our results, we supplemented our analysis with a study of the introduction of coronavirus-related university webpages. We find that universities with large numbers of students are more likely to use social media and the web to speak about the pandemic sooner than institutions with fewer students. Universities with large financial resources are also more likely to tweet sooner, but they do not introduce Covid-19 webpages faster than other universities. We also find evidence of...

Semantic Journeys: Quantifying Change in Emoji Meaning from 2012-2018

ArXiv, 2021

The semantics of emoji has, to date, been considered from a static perspective. We offer the firs... more The semantics of emoji has, to date, been considered from a static perspective. We offer the first longitudinal study of how emoji semantics changes over time, applying techniques from computational linguistics to six years of Twitter data. We identify five patterns in emoji semantic development and find evidence that the less abstract an emoji is, the more likely it is to undergo semantic change. In addition, we analyse select emoji in more detail, examining the effect of seasonality and world events on emoji semantics. To aid future work on emoji and semantics, we make our data publicly available along with a web-based interface that anyone can use to explore semantic change in emoji.

C L ] 2 2 Se p 20 17 Improving Language Modelling with Noise Contrastive Estimation

Neural language models do not scale well when the vocabulary is large. Noise contrastive estimati... more Neural language models do not scale well when the vocabulary is large. Noise contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, it was considered to be an unsuccessful approach for language modelling. A sufficient investigation of the hyperparameters in the NCE-based neural language models was also missing. In this paper, we showed that NCE can be a successful approach in neural language modelling when the hyperparameters of a neural network are tuned appropriately. We introduced the ‘search-thenconverge’ learning rate schedule for NCE and designed a heuristic that specifies how to use this schedule. The impact of the other important hyperparameters, such as the dropout rate and the weight initialisation range, was also demonstrated. We showed that appropriate tuning of NCE-based neural language models outperforms the state-of-the-art single-model m...

A Spectral Method that Worked Well in the SPiCe'16 Competition

We present methods used in our submission to the Sequence Prediction ChallengE (SPiCe’16). The tw... more

Capturing Changes in Mood Over Time in Longitudinal Data Using Ensemble Methodologies

Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology

Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Word embeddings are increasingly used for the automatic detection of semantic change; yet, a robu... more Word embeddings are increasingly used for the automatic detection of semantic change; yet, a robust evaluation and systematic comparison of the choices involved has been lacking. We propose a new evaluation framework for semantic change detection and find that (i) using the whole time series is preferable over only comparing between the first and last time points; (ii) independently trained and aligned embeddings perform better than continuously trained embeddings for long time periods; and (iii) that the reference point for comparison matters. We also present an analysis of the changes detected on a large Twitter dataset spanning 5.5 years.

Relating RNN Layers with the Spectral WFA Ranks in Sequence Modelling

Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges

We analyse Recurrent Neural Networks (RNNs) to understand the significance of multiple LSTM layer... more We analyse Recurrent Neural Networks (RNNs) to understand the significance of multiple LSTM layers. We argue that the Weighted Finite-state Automata (WFA) trained using a spectral learning algorithm are helpful to analyse RNNs. Our results suggest that multiple LSTM layers in RNNs help learning distributed hidden states, but have a smaller impact on the ability to learn long-term dependencies. The analysis is based on the empirical results, however relevant theory (whenever possible) was discussed to justify and support our conclusions.

An Improved Crowdsourcing Based Evaluation Technique for Word Embedding Methods

Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, 2016

In this proposal track paper, we have presented a crowdsourcing-based word embedding evaluation t... more In this proposal track paper, we have presented a crowdsourcing-based word embedding evaluation technique that will be more reliable and linguistically justified. The method is designed for intrinsic evaluation and extends the approach proposed in (Schnabel et al., 2015). Our improved evaluation technique captures word relatedness based on the word context.

Estimating the Accuracy of Spectral Learning for HMMs

Lecture Notes in Computer Science, 2016

The version in the Kent Academic Repository may differ from the final published version. Users ar... more

Efficiency of Commercial Banks in Bangladesh-a Data Envelopment Analysis

European Journal of Economics, Finance and Administrative Sciences

ABSTRACT

Implementation Architecture of Proxy Mobile IPv6 Protocol for NS2 Simulator Software

2009 International Conference on Communication Software and Networks, 2009

Page 1. Implementation Architecture of Proxy Mobile IPv6 Protocol for NS2 Simulator software Farh... more

Data Pattern Recognition using Neural Network with Back-Propagation Training

2006 International Conference on Electrical and Computer Engineering, 2006

Page 1. 4th International Conference on Electrical and Computer Engineering ICECE 2006, 19-21 Dec... more