Said Kerrache

Papers by Said Kerrache

Link prediction in directed complex networks: combining similarity-popularity and path patterns mining

Applied intelligence, Jul 2, 2024

A Survey of Latent Factor Models in Recommender Systems

arXiv (Cornell University), May 28, 2024

Recommender systems are essential tools in the digital era, providing personalized content to users in areas like e-commerce, entertainment, and social media. Among the many approaches developed to create these systems, latent factor models have proven particularly effective. This survey systematically reviews latent factor models in recommender systems, focusing on their core principles, methodologies, and recent advancements. The literature is examined through a structured framework covering learning data, model architecture, learning strategies, and optimization techniques. The analysis includes a taxonomy of contributions and detailed discussions of the types of learning data used, such as implicit feedback, trust, and content data; various models, such as probabilistic, nonlinear, and neural models; and diverse learning strategies, such as online learning, transfer learning, and active learning. Furthermore, the survey addresses the optimization strategies used to train latent factor models, improving their performance and scalability. By identifying trends, gaps, and potential research directions, this survey aims to provide valuable insights for researchers and practitioners looking to advance the field of recommender systems.

A Sequence-to-Sequence Approach for Arabic Pronoun Resolution

arXiv (Cornell University), May 19, 2023

This paper proposes a sequence-to-sequence learning approach for Arabic pronoun resolution, which explores the effectiveness of using advanced natural language processing (NLP) techniques, specifically Bi-LSTM and the BERT pre-trained language model, in solving the pronoun resolution problem in Arabic. The proposed approach is evaluated on the AnATAr dataset, and its performance is compared to several baseline models, including traditional machine learning models and handcrafted feature-based models. Our results demonstrate that the proposed model outperforms the baseline models, which include KNN, logistic regression, and SVM, across all metrics. In addition, we explore the effectiveness of various modifications to the model, including concatenating the anaphor text beside the paragraph text as input, adding a mask to focus on candidate scores, and filtering candidates based on gender and number agreement with the anaphor. Our results show that these modifications significantly improve the model's performance, achieving up to 81% on MRR and 71% on F1 score while also demonstrating higher precision, recall, and accuracy. These findings suggest that the proposed model is an effective approach to Arabic pronoun resolution and highlight the potential benefits of leveraging advanced neural NLP models.
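For readers unfamiliar with the MRR figure quoted above, the following is a minimal sketch of how mean reciprocal rank is usually computed over ranked candidate lists. The function name, the data layout, and the toy data are illustrative assumptions; this is not the paper's evaluation code.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct candidate over all queries.

    A query whose correct candidate is absent from its ranking contributes 0.
    """
    if len(rankings) != len(gold):
        raise ValueError("rankings and gold must have the same length")
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, answer in zip(rankings, gold):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == answer:
                total += 1.0 / position
                break  # only the first match counts
    return total / len(rankings)


# Hypothetical toy data: three anaphors, each with ranked antecedent candidates.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "x"]  # the third gold antecedent is not among the candidates
mrr = mean_reciprocal_rank(rankings, gold)
```

In the toy data the correct candidates sit at ranks 1 and 2 and are missing in the third case, so the score is the mean of 1, 1/2, and 0.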

A Recommendation Approach Based on Similarity-Popularity Models of Complex Networks

Recommender systems have become an essential tool for providers and users of online services and goods, especially with the increased use of the Internet to access information and purchase products and services. This work proposes a novel recommendation method based on complex networks generated by a similarity-popularity model. We first construct a model of a network having users and items as nodes from observed ratings and then use it to predict unseen ratings. The prospect of producing accurate rating predictions using a similarity-popularity model with hidden metric spaces and dot-product similarity is explored. The proposed approach is implemented and experimentally compared against baseline and state-of-the-art recommendation methods on 21 datasets from various domains. The experimental results demonstrate that the proposed method produces accurate predictions and outperforms existing methods. We also show that the proposed approach produces superior results in low dimensions, demonstrating its effectiveness for data visualization and exploration.

A Sequence-Aware Recommendation Method Based on Complex Networks

arXiv (Cornell University), Sep 30, 2022

Online stores and service providers rely heavily on recommendation software to guide users through the vast number of available products. Consequently, the field of recommender systems has attracted increased attention from the industry and academia alike, but despite this joint effort, the field still faces several challenges. For instance, most existing work models the recommendation problem as a matrix completion problem to predict the user preference for an item. This abstraction prevents the system from utilizing the rich information from the ordered sequence of user actions logged in online sessions. To address this limitation, researchers have recently developed a promising new breed of algorithms called sequence-aware recommender systems to predict the user's next action by utilizing the time series composed of the sequence of actions in an ongoing user session. This paper proposes a novel sequence-aware recommendation approach based on a complex network generated by the hidden metric space model, which combines node similarity and popularity to generate links. We build a network model from data and then use it to predict the user's subsequent actions. The network model provides an additional source of information that improves the accuracy of the recommendations. The proposed method is implemented and tested experimentally on a large dataset. The results show that the proposed approach performs better than state-of-the-art recommendation methods.

An Approach for Link Prediction in Directed Complex Networks based on Asymmetric Similarity-Popularity

arXiv (Cornell University), Jul 15, 2022

Complex networks are graphs representing real-life systems that exhibit unique characteristics not found in purely regular or completely random graphs. The study of such systems is vital but challenging due to the complexity of the underlying processes. This task has nevertheless been made easier in recent decades thanks to the availability of large amounts of networked data. Link prediction in complex networks aims to estimate the likelihood that a link between two nodes is missing from the network. Links can be missing due to imperfections in data collection or simply because they are yet to appear. Discovering new relationships between entities in networked data has attracted researchers' attention in various domains such as sociology, computer science, physics, and biology. Most existing research focuses on link prediction in undirected complex networks. However, not all real-life systems can be faithfully represented as undirected networks. This simplifying assumption is often made when using link prediction algorithms but inevitably leads to loss of information about relations among nodes and degradation in prediction performance. This paper introduces a link prediction method designed explicitly for directed networks. It is based on the similarity-popularity paradigm, which has recently proven successful in undirected networks. The presented algorithms handle the asymmetry in node relationships by modeling it as asymmetry in similarity and popularity. Given the observed network topology, the algorithms approximate the hidden similarities as shortest path distances using edge weights that capture and factor out the links' asymmetry and nodes' popularity. The proposed approach is evaluated on real-life networks, and the experimental results demonstrate its effectiveness in predicting missing links across a broad spectrum of networked data types and sizes.

A Sequence-Aware Recommendation Method based on Complex Networks

International Journal of Advanced Computer Science and Applications

Online stores and service providers rely heavily on recommendation software to guide users through the vast number of available products. Consequently, the field of recommender systems has attracted increased attention from the industry and academia alike, but despite this joint effort, the field still faces several challenges. For instance, most existing work models the recommendation problem as a matrix completion problem to predict the user preference for an item. This abstraction prevents the system from utilizing the rich information from the ordered sequence of user actions logged in online sessions. To address this limitation, researchers have recently developed a promising new breed of algorithms called sequence-aware recommender systems to predict the user's next action by utilizing the time series composed of the sequence of actions in an ongoing user session. This paper proposes a novel sequence-aware recommendation approach based on a complex network generated by the hidden metric space model, which combines node similarity and popularity to generate links. We build a network model from data and then use it to predict the user's subsequent actions. The network model provides an additional information source that improves the recommendations' accuracy. The proposed method is implemented and tested experimentally on a large dataset. The results show that the proposed approach performs better than state-of-the-art recommendation methods.

A Complex Network based Graph Embedding Method for Link Prediction

arXiv (Cornell University), Sep 11, 2022

Graph embedding methods aim at finding useful graph representations by mapping nodes to a low-dimensional vector space. It is a task with important downstream applications, such as link prediction, graph reconstruction, data visualization, node classification, and language modeling. In recent years, the field of graph embedding has witnessed a shift from linear algebraic approaches towards local, gradient-based optimization methods combined with random walks and deep neural networks to tackle the problem of embedding large graphs. However, despite this improvement in the optimization tools, graph embedding methods are still generically designed in a way that is oblivious to the particularities of real-life networks. Indeed, there has been significant progress in understanding and modeling complex real-life networks in recent years. However, the obtained results have had a minor influence on the development of graph embedding algorithms. This paper aims to remedy this by designing a graph embedding method that takes advantage of recent valuable insights from the field of network science. More precisely, we present a novel graph embedding approach based on the popularity-similarity and local attraction paradigms. We evaluate the performance of the proposed approach on the link prediction task on a large number of real-life networks. We show, using extensive experimental analysis, that the proposed method outperforms state-of-the-art graph embedding algorithms. We also demonstrate its robustness to data scarcity and the choice of embedding dimensionality.

Image Captioning based on Feature Refinement and Reflective Decoding

arXiv (Cornell University), Jun 16, 2022

Image captioning is the process of automatically generating a description of an image in natural language. Image captioning is one of the significant challenges in image understanding since it requires not only recognizing salient objects in the image but also their attributes and the way they interact. The system must then generate a syntactically and semantically correct caption that describes the image content in natural language. With the significant progress in deep learning models and their ability to effectively encode large sets of images and generate correct sentences, several neural-based captioning approaches have been proposed recently, each trying to achieve better accuracy and caption quality. This paper introduces an encoder-decoder-based image captioning system in which the encoder extracts spatial features from the image using ResNet-101. This stage is followed by a refining model, which uses an attention-on-attention mechanism to extract the visual features of the target image objects and then determine their interactions. The decoder consists of an attention-based recurrent module and a reflective attention module, which collaboratively apply attention to the visual and textual features to enhance the decoder's ability to model long-term sequential dependencies. Extensive experiments performed on Flickr30K show the effectiveness of the proposed approach and the high quality of the generated captions.

Constrained Mass Optimal Transport

arXiv (Cornell University), Jun 5, 2022

Optimal mass transport, also known as the earth mover's problem, is an optimization problem with important applications in various disciplines, including economics, probability theory, fluid dynamics, cosmology, and geophysics, to name a few. Optimal transport has also found successful applications in image registration, content-based image retrieval, and more generally in pattern recognition and machine learning as a way to measure dissimilarity among data. This paper introduces the problem of constrained optimal transport. The time-dependent formulation, or more precisely the fluid dynamics approach, is used as a starting point from which the constrained problem is defined by imposing a soft constraint on the density and momentum fields or restricting them to a subset of curves that satisfy some prescribed conditions. A family of algorithms is introduced to solve a class of constrained saddle point problems, which has convexly constrained optimal transport on closed convex subsets of the Euclidean space as a special case. Convergence proofs and numerical results are presented.
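As a reminder of the starting point mentioned above, the fluid dynamics (time-dependent) formulation of optimal transport is commonly written as below. This is the standard Benamou-Brenier form with quadratic cost, given here as an assumption about the setting; the symbols for density, velocity, and momentum, the domain \Omega, and the normalization of time to [0, 1] are notational choices not taken from the abstract.

```latex
\min_{\rho,\, v} \; \int_0^1 \!\! \int_{\Omega} \tfrac{1}{2}\, \rho(t,x)\, |v(t,x)|^2 \, dx \, dt
\quad \text{subject to} \quad
\partial_t \rho + \nabla \cdot (\rho v) = 0, \qquad
\rho(0,\cdot) = \rho_0, \quad \rho(1,\cdot) = \rho_1 .

% With the momentum m = \rho v, the problem becomes convex in (\rho, m):
\min_{\rho,\, m} \; \int_0^1 \!\! \int_{\Omega} \frac{|m(t,x)|^2}{2\, \rho(t,x)} \, dx \, dt
\quad \text{subject to} \quad
\partial_t \rho + \nabla \cdot m = 0, \qquad
\rho(0,\cdot) = \rho_0, \quad \rho(1,\cdot) = \rho_1 .
```

In this notation, a constrained variant would additionally restrict the pair (\rho, m) to a prescribed set or penalize departures from it, which is one way to read the soft and hard constraints on the density and momentum fields described in the abstract.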

Optrans: A Parallel Software Library for Optimal Transport

In recent years, optimal transport has become a highly active and wide area of research, thanks to the discovery of a number of important theoretical results and the development of an array of new applications in various fields ranging from cosmology, geophysics, oceanography, meteorology and fluid mechanics to optics, image processing and pattern recognition. Despite this ample field of applications, there is a serious lack of numerical optimal transport software available to the research and academic community. To remedy this shortage, this paper introduces Optrans, a parallel library for solving time-dependent optimal transport problems in free and convexly constrained forms. Optrans is designed following an object-oriented approach and exploits the capabilities of C++, the implementation language, to offer an easy-to-use programming interface and ensure easy extensibility. The library uses MPI for communication and synchronization, allowing it to run on a variety of architec...

LinkPred: A High Performance Library for Link Prediction in Complex Networks

LinkPred is a high performance parallel and distributed link prediction library that includes the implementation of the major link prediction algorithms available in the literature by development from scratch and wrapping or translating existing implementations. The library offers a unified interface that facilitates the use and comparison of link prediction algorithms by researchers as well as practitioners.

The Effect of State Space Clustering on the Performance of Simulated Annealing and Its Topology-Aware Variant

Simulated annealing is one of the most widely used algorithms for global optimization. Due to its success, several variants of classical simulated annealing have been proposed. These variants may use more sophisticated neighborhood selection strategies or may employ different acceptance probabilities. Topology-aware simulated annealing is one such variant that takes into consideration the branching factor of states when performing uphill moves. The experimental evaluation done on topology-aware simulated annealing suggests the potential effect of clustering on performance. In this paper, we experimentally investigate the effect of the state space clustering on the performance of classical simulated annealing and its topology-aware variant. This is achieved through the use of networks with different degrees of clustering as search spaces. These networks are generated using the hidden metric model, a recently proposed complex network model. The results show that the effects are indeed...
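To make the setting concrete, here is a minimal sketch of classical simulated annealing when the search space is a network, with states as nodes and moves along edges. The function name, the geometric cooling schedule, the parameter values, and the toy ring network are all assumptions made for illustration. The topology-aware variant is not shown, since the abstract does not state its acceptance rule.

```python
import math
import random
from typing import Dict, Hashable, List


def simulated_annealing(
    neighbors: Dict[Hashable, List[Hashable]],
    cost: Dict[Hashable, float],
    start: Hashable,
    t0: float = 1.0,
    cooling: float = 0.95,
    steps: int = 1000,
    seed: int = 0,
) -> Hashable:
    """Classical simulated annealing over a graph-structured state space.

    Returns the lowest-cost state visited during the walk.
    """
    rng = random.Random(seed)
    current = best = start
    temperature = t0
    for _ in range(steps):
        if not neighbors[current]:
            break  # isolated state: nowhere to move
        candidate = rng.choice(neighbors[current])
        delta = cost[candidate] - cost[current]
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / T) (Metropolis criterion).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if cost[current] < cost[best]:
                best = current
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best


# Hypothetical toy search space: a ring of six states with arbitrary costs.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
costs = {0: 5.0, 1: 3.0, 2: 4.0, 3: 1.0, 4: 2.0, 5: 0.0}
best_state = simulated_annealing(ring, costs, start=0)
```

Because the function tracks the best state seen so far, the returned state never costs more than the start state, whatever the random seed.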

Constrained time-dependent optimal transport: algorithms and application to image interpolation

LinkPred: a high performance library for link prediction in complex networks

PeerJ Computer Science, 2021

The problem of determining the likelihood of the existence of a link between two nodes in a network is called link prediction. This is made possible thanks to the existence of a topological structure in most real-life networks. In other words, the topologies of networked systems such as the World Wide Web, the Internet, metabolic networks, and human society are far from random, which implies that partial observations of these networks can be used to infer information about undiscovered interactions. Significant research efforts have been invested into the development of link prediction algorithms, and some researchers have made the implementation of their methods available to the research community. These implementations, however, are often written in different languages and use different modalities of interaction with the user, which hinders their effective use. This paper introduces LinkPred, a high-performance parallel and distributed link prediction library that includes the imp...

A Scalable Similarity-Popularity Link Prediction Method

Scientific Reports, 2020

Link prediction is the task of computing the likelihood that a link exists between two given nodes in a network. With countless applications in different areas of science and engineering, link prediction has received the attention of many researchers working in various disciplines. Considerable research efforts have been invested into the development of increasingly accurate prediction methods. Most of the proposed algorithms, however, have limited use in practice because of their high computational requirements. The aim of this work is to develop a scalable link prediction algorithm that offers a higher overall predictive power than existing methods. The proposed solution falls into the class of global, parameter-free similarity-popularity-based methods, and in it, we assume that network topology is governed by three factors: popularity of the nodes, their similarity and the attraction induced by local neighbourhood. In our approach, popularity and neighbourhood-caused attraction a...

Scalable Link Prediction in Complex Networks Using a Type of Geodesic Distance

International journal of simulation: systems, science & technology, 2017

Interpolation Between Images by Constrained Optimal Transport

Proceedings of the International Conference on Computer Vision Theory and Applications, 2011

In this paper, the recently proposed technique of constrained optimal transport is used to interpolate between images under specified constraints. The intensity values in both images are considered as mass distributions, and a flow of minimum kinetic energy is computed to transport the initial distribution to the final one, while satisfying specified constraints on the intermediate mass as well as the velocity or the momentum field. As an application, the proposed method is used for interpolating between images with a common unchanged part, as well as under constraints on the volume expansion and contraction. The latter is achieved by imposing bounds on the divergence of the velocity field of the flow. This constraint is discretized and then integrated into the problem Lagrangian using the augmented Lagrangian method. A variation of the solution is also presented, where the constraint is decoupled into two constraints coordinated by an additional Lagrange multiplier. This allows a considerable speedup, though numerical robustness decreases in certain cases. Constrained optimal transport can potentially be used in image registration. In particular, the proposed method for controlling the volume change has potential application in registration of images under volume change constraints, as is the case for medical images depicting muscle movements or those with contrast-enhancing structures.

Speaker Identification in Different Emotional States in Arabic and English

IEEE Access, 2020

Speaker recognition is an important application of digital speech processing. However, a major challenge degrading the robustness of speaker-recognition systems is variation in the emotional states of speakers, such as happiness, anger, sadness, or surprise. In this paper, we propose a speaker recognition system corresponding to three states, namely emotional, neutral, and with no consideration for a speaker's state (i.e., the speaker can be in an emotional state or neutral state), for two languages: Arabic and English. Additionally, cross-language speaker recognition was applied in emotional, neutral, and (emotional + neutral) states. Convolutional neural network and long short-term memory models were used to design a convolutional recurrent neural network (CRNN) main system. We also investigated the use of linearly spaced spectrograms as speech-feature inputs. The proposed system utilizes the KSUEmotions, emotional prosody speech and transcripts, WEST POINT, and TIMIT corpora. The CRNN system exhibited accuracies as high as 97.4% and 97.18% for Arabic and English emotional speech inputs, respectively, and 99.89% and 99.4% for Arabic and English neutral speech inputs, respectively. For the cross-language program, the overall CRNN system accuracy was as high as 91.83%, 99.88%, and 95.36% for emotional, neutral, and (emotional + neutral) states, respectively.

An Automated Advice Seeking and Filtering System

International Journal of Advanced Computer Science and Applications, 2019

Advice seeking and knowledge exchange over the Internet and social networks have become very common activities. The system proposed in this work aims to assist users in choosing the best possible advice and allows them to exchange advice automatically without knowing each other. The approach used in this work is based on a newly proposed dynamic version of the hidden metric model, where the distance between each pair of users is computed and used to represent the users in a d-dimensional Euclidean space. In addition to the position, a degree is also assigned to each user, which represents his/her popularity or how much he/she is trusted by the system. The two factors, distance and degree, are used in selecting advice providers. Both the positions of the users and their degrees are adjusted according to the feedback of the users. The proposed feedback algorithm is based on a Bayesian framework and aims to obtain more accurate advice in the future. The system is evaluated and tested using simulation. In the experiment, the mean square error was measured for different parameters. All parts of the experiment are performed on a varying number of users (100, 500, and 1000 users). This shows that the system can scale to a large number of users.