Papers by Mohamed Quafafou

Entropy
Despite its undeniable success, classical machine learning remains a resource-intensive process. Training state-of-the-art models is now practical only on high-speed computer hardware. As this trend is expected to continue, it should come as no surprise that an increasing number of machine learning researchers are investigating the possible advantages of quantum computing. The scientific literature on Quantum Machine Learning is now enormous, and a review of its current state that can be comprehended without a physics background is necessary. The objective of this study is to present a review of Quantum Machine Learning from the perspective of conventional techniques. Starting with a research path from fundamental quantum theory through Quantum Machine Learning algorithms from a computer scientist’s perspective, we discuss a set of basic algorithms for Quantum Machine Learning, which are the fundamental components for Quantum Machine Lea...
The proceedings of the 9th edition of the Conférence Internationale Francophone sur la Science des Données (CIFSD, https://cifsd-2021.sciencesconf.org) gather all the contributions presented at the conference between 9 and 11 June 2021. This edition was organized by Aix-Marseille Université and the Laboratoire d'Informatique et Systèmes (LIS UMR 7020). Because of the public health situation, it was held remotely from Marseille (France). The theme highlighted for this edition was data science for health.
IFIP Advances in Information and Communication Technology, 2019
Combining multiple classifiers can produce a better solution than relying on a single learner. However, it is difficult to select reliable learning algorithms when their performances differ widely. In this paper, a combination of supervised learning algorithms is proposed to provide the best decision. Our method transforms a classifier's score on the training data into a reliability score. Then, a set of reliable candidates is determined through static and dynamic selection. Experimental results on eight datasets show that our algorithm gives a better average accuracy than other ensemble methods and the base classifiers.

Over the last twenty years, information integration has attracted considerable effort from both industry and academia. Approaches to information integration developed so far can be categorized as follows: (1) first-generation approaches, which require the definition of a global schema and a semantic integration performed upfront (before query execution); (2) second-generation approaches, well illustrated by the dataspace management concept, which promote pay-as-you-go data integration. The first category has led to well-known mediation approaches such as GAV (Global as View), LAV (Local as View), GLAV (Generalized Local As View), BAV (Both As View), and BGLAV (BYU Global-Local-as-View). Approaches in the second category are geared towards the development of dataspace management systems and are currently gaining a lot of attention. In this chapter we are interested in exploiting both types of approaches in querying conflicting data spread over multiple web ...

Social microblogging services have an especially significant role in our society. Twitter is one of the most popular microblogging sites used by people to find relevant information (e.g., breaking news, popular trends, information about people of interest, etc.). In this context, retrieving information from such data has recently gained growing attention and opened new challenges. However, both the posts and the queries are usually short, which may degrade the search results. Query Expansion (QE) plays a central role in addressing this issue. Indeed, a word can have several meanings, of which only one applies in a given context. In this paper, we propose a QE method that takes the meaning of the context into account. To do so, we use patterns and Word Embeddings to expand users’ queries. We experiment with and evaluate the proposed method on the TREC 2011 dataset, which contains approximately 16 million tweets and 49 queries. Results revealed the effectiveness of the proposed approach and show the interest of combining patter...

Geo-FUZZ: Fuzzy-based algorithm for suspicious geo-tagged tweets detection
2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2018
Social media such as Twitter are becoming an increasingly valuable source for capturing and analyzing users’ conversations. However, information published by users may contain dangerous content and negatively influence other users. In this paper, we propose an algorithm for detecting suspicious tweets among geo-tagged tweets, based on fuzzy logic and probabilistic methods. The novelty of our work lies in considering tweet location, labeling regions, and classifying tweets based not only on their text content but also on the region where the message is posted. Moreover, the method generates a geo-tagged map to easily visualize the classified tweets. The experimental results show that the proposed Geo-FUZZ algorithm is more accurate than the previous algorithm, FUZZ-STD. Furthermore, visualizing hot suspicious zones allows us to identify risks early, reduce their impact, and control the spread of similar abnormal tweets for security purposes.
In the field of ensemble learning, voting among different experts can produce an optimal solution. However, the quality of the vote depends on the participants' expertise. In this paper, an expert selection algorithm is proposed that uses a reliability measure extracted from the confidence score. Our method has been applied to a combination of 6 algorithms. Experimental results on 8 datasets show that the proposed reliable majority voting algorithm provides a better average accuracy than ordinary majority voting and the base classifiers. Keywords: reliable majority voting, classification, ensemble learning.
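The idea of voting only among reliable experts can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `reliable_majority_vote`, the fixed `threshold`, and the fallback to ordinary voting are all assumptions made for illustration, and the paper's actual reliability measure is not given in the abstract.

```python
from collections import Counter

def reliable_majority_vote(predictions, confidences, threshold=0.6):
    """Majority vote restricted to classifiers judged reliable.

    predictions: one predicted label per classifier.
    confidences: matching confidence scores in [0, 1].
    threshold:   hypothetical cut-off; the paper's reliability measure may differ.
    """
    # Keep only the votes whose confidence reaches the assumed threshold.
    reliable = [p for p, c in zip(predictions, confidences) if c >= threshold]
    # Fall back to ordinary majority voting when no vote is considered reliable.
    pool = reliable if reliable else list(predictions)
    return Counter(pool).most_common(1)[0][0]

# Six classifiers: four weakly confident "ham" votes, two confident "spam" votes.
votes = ["spam", "ham", "ham", "spam", "ham", "ham"]
scores = [0.95, 0.55, 0.52, 0.90, 0.51, 0.58]

ordinary = Counter(votes).most_common(1)[0][0]   # ordinary majority voting
filtered = reliable_majority_vote(votes, scores)  # voting among reliable experts only
```

In this toy input, ordinary voting follows the four low-confidence votes, while the filtered vote follows the two confident ones, which is the kind of difference the abstract attributes to expert selection.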

We propose to extend search in cSON (community Semantic Overlay Network), a semantic overlay network that requires peers to be organized into communities. cSON is designed for efficient search in unstructured Peer-to-Peer (P2P) systems. In this paper, a community is composed of one super-peer (e.g. the administrator of the community) and several peers: the super-peer describes a domain (e.g. an ontology), and a peer (with its own data schema) joins a community that matches its own domain. One challenge in cSON is how to efficiently extend the scope of a search to all communities (and not only to neighboring communities). We propose an algorithm that builds, starting from the administrator of a community, a Maximal-affinity Covering Tree (MCT). The resulting MCTs are later used by queries to find the pertinent peers in each community. We give a performance evaluation of the creation of an MCT, and we then compare our routing algorithm wi...

Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVI, 2016
The FIBAD approach is introduced to compute approximate borders of frequent itemsets by leveraging dualization and the computation of approximate minimal transversals of hypergraphs. The distinctive feature of FIBAD's theoretical foundations is approximate dualization, in which a new function f is defined to compute the approximate negative border. From a methodological point of view, the function f is implemented by the AMTHR method, which consists of a reduction of the hypergraph followed by a computation of its minimal transversals. For evaluation purposes, we study the sensitivity of FIBAD to AMTHR by replacing the latter with two other algorithms that compute approximate minimal transversals. We also compare our approximate dualization-based method with an existing approach that computes the approximate borders directly, without dualization. The experimental results show that our method outperforms the other methods, as it produces the borders of highest quality.
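For readers unfamiliar with the underlying notion, a minimal transversal of a hypergraph is a vertex set that intersects every hyperedge and has no proper subset with that property. The sketch below computes them exactly by brute force on a tiny hypergraph; it is not AMTHR or FIBAD, it performs no approximation or reduction, and the function name `minimal_transversals` is chosen here for illustration only.

```python
from itertools import combinations

def minimal_transversals(edges):
    """Exact minimal transversals (minimal hitting sets) of a small hypergraph.

    edges: iterable of non-empty vertex collections (the hyperedges).
    Exponential brute force, intended only for tiny examples.
    """
    edges = [frozenset(e) for e in edges]
    vertices = sorted(set().union(*edges))
    found = []
    # Enumerate candidates by increasing size so that smaller transversals come first.
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            s = frozenset(candidate)
            # Skip supersets of an already found transversal: they are not minimal.
            if any(t <= s for t in found):
                continue
            # Keep the candidate if it intersects every hyperedge.
            if all(s & e for e in edges):
                found.append(s)
    return found

# A path-like hypergraph with three hyperedges over vertices 1..4.
result = minimal_transversals([{1, 2}, {2, 3}, {3, 4}])
```

For this input the minimal transversals are {1, 3}, {2, 3}, and {2, 4}. Exact enumeration of this kind does not scale, which is one motivation for approximate methods such as those discussed in the abstract.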
2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), Dec 1, 2018
Tracking and location-acquisition technologies have increased the availability of trajectory data. Much effort is devoted to developing methods for mining and analysing trajectories because of their importance in many applications, such as traffic control and urban planning. In this paper, we present a new trajectory analysis and visualisation framework for massive movement data. This framework leverages formal concepts, sequential patterns, and emerging patterns, and analyses the evolution of mobility patterns through time. Tagged city maps are generated to display the resulting evolution analysis and directions at different spatio-temporal granularity values. Experiments on a real-world dataset show the relevance of the proposal and the usefulness of the resulting tagged city maps.

International Journal of Data Science and Analytics, 2018
Trajectory mining is a challenging and crucial problem, especially in the context of smart cities, where many applications depend on human behaviors. In this paper, we characterize such behaviors by patterns, where each pattern type represents a particular behavior, e.g. emerging, latent, lost, etc. Starting from raw GPS data, we introduce algorithms that compute a formal concept lattice encoding optimal correspondences between hidden patterns and trajectories. In order to detect behaviors, we propose an algorithm that analyses the evolution of the discovered formal concepts over time. The method generates tagged city maps to easily visualize the resulting behaviors at different spatio-temporal granularity values. Refined or coarse analysis can thus be performed for a given situation. Experimental results using real-world GPS trajectory data show the relevance of the proposed method and the usefulness of the resulting tagged city maps.

Enseignement de la prospection miniere : contribution informatique [Teaching mineral prospecting: a computing contribution]
http://www.theses.fr, 1992
This work is a contribution to the study of the problem of teaching mineral prospecting techniques based on drilling, building on the achievements of Computer-Assisted Instruction. Two classes of problems are encountered: general ones (related to the pedagogical use of the computer) and specific ones (resulting from the marriage of computer science and geology). We study educational systems in general (difficulties, problems, and evolution) and those dedicated to mineral prospecting in particular. We set ourselves apart from previous work in the mining domain so as not to reduce the problem of mineral prospecting to its geometric dimension alone. We then propose an architecture that takes both classes of problems into account. Simulating a prospecting campaign requires modeling and manipulating geological knowledge in order to describe a synthetic geological site. The most important problems often concern the construction of graphical environments. We therefore studied the value of the object-oriented approach in the design of graphical systems. From a pedagogical point of view, we formalized the student's actions by proposing a structuring of domain expertise (mineral prospecting) and of teaching expertise.

JUTI: Jurnal Ilmiah Teknologi Informasi, 2015
Classification is a part of learning systems that focuses on understanding patterns through the representation and generalization of data. Determining the best classification prediction becomes a problem when there are several inputs from different methods in a heterogeneous data environment. Decision fusion can be used to determine the recommended output of several classification methods. We choose voting and meta-learning approaches as decision fusion methods. The study has two phases: a phase in which heterogeneous classification methods build predictions, and a phase in which the recommendations of those methods are combined into a single conclusion. The classification setting in focus is multi-label classification. Binary Relevance (BR), Classifier Chains (CC), Hierarchy of Multi-label Classifiers (HOMER), and Multi-label k Nearest Neighbors (MLkNN) are the classification methods used to provide prediction recommendations through different approaches. In the decision fusion phase, the Ignore method is proposed as a meta-learning approach. Ignore combines decisions by learning the input patterns from the learning systems. To compare Ignore's performance, a consensus method is used as the voting approach. The final results show that Ignore gives the best results for the recall parameter. Ignore predicts fewer false negatives than the consensus method at 0.5 and 0.75. The results of this study show that Ignore can be used for meta-learning, although its performance must be improved so that it can adapt to heterogeneous data.
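The consensus baseline mentioned above can be pictured with a short sketch. It assumes that the values 0.5 and 0.75 are the fraction of classifiers that must agree before a label is predicted, and that the comparison is inclusive; neither detail is stated in the abstract, and the function name `consensus_vote` is hypothetical. The Ignore meta-learner itself is not reproduced here.

```python
def consensus_vote(predictions, threshold=0.5):
    """Multi-label consensus voting.

    predictions: one set of predicted labels per classifier.
    threshold:   assumed minimum fraction of classifiers that must predict a label.
    """
    n = len(predictions)
    labels = set().union(*predictions)
    # Keep a label when enough classifiers agree on it (inclusive comparison assumed).
    return {lab for lab in labels
            if sum(lab in p for p in predictions) / n >= threshold}

# Four classifiers (for instance BR, CC, HOMER, and MLkNN) predicting label sets.
outputs = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]

loose = consensus_vote(outputs, threshold=0.5)    # label kept if at least 2 of 4 agree
strict = consensus_vote(outputs, threshold=0.75)  # label kept if at least 3 of 4 agree
```

Raising the threshold drops labels that fewer classifiers support, which tends to increase false negatives; this is consistent with recall being the metric on which the abstract compares consensus voting with Ignore.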
Computer Science & Information Technology ( CS & IT ), 2014
In Web service research, providing methods and tools for the automatic composition of services on the Web is still the subject of ongoing research. Despite the approaches proposed so far, this issue remains open. In this paper, we propose a seamless way to automatically compose web services from an abstract process model. The composition process is based on the concept of web service popularity. To validate our approach, an implementation is presented.
Perception in Human-Computer Symbiosis
Today, computers and, more generally, smart technologies do not take the diversity of perception into account. This excludes the plurality of representations and decisions, even though such diversity may play a crucial role in human-computer interaction, especially in our small world. In this paper, we introduce a conceptual framework that builds a bridge between set theory and perception theory to support computing with perceptions. In this context, human-machine interaction is not only guided by computation but is also based on human-human interaction through machines and social networks.

Information Systems, 2011
This paper describes a process for mashing up heterogeneous data sources based on the Multi-data source Fusion Approach (MFA) [?]. The aim of MFA is to facilitate the fusion of heterogeneous data sources in dynamic contexts such as the Web. Data sources are either static or active: static data sources can be structured or semi-structured (e.g. XML documents or databases), whereas active sources are services (e.g. Web services). Our main objective is to combine (Web) data sources with minimal effort from the user. This objective is crucial because the mashing process implies easy and fast integration of data sources. We assume that the user is not an expert in this field but understands the meaning of the data being integrated. In this paper, we consider two important aspects of the Web mashing process. The first concerns information extraction from the Web. The results of this process are the static data sources that are later used together with services to create a new result or application. The second concerns the semantic reconciliation of data sources. This step consists in (re-)generating the Conflict data source in order to ease the rewriting of semantic queries into sub-queries over the data sources (a problem not addressed in this paper). We give the design of our system, MDSManager, and illustrate the process through a real-life application.
Résumés d'instances pour l'extraction de connaissances à partir de données relationnelles et textuelles [Instance summaries for knowledge extraction from relational and textual data]
This paper presents an approach for named entity recognition in radio broadcast transcriptions. We use the structure of named entities to improve their recognition rate. Indeed, the entity space can be represented by a hierarchical structure (a tree). Thus, a concept can be seen as a node in the tree, and an entity as a path through the structure of the space. We show the benefit of this representation using Conditional Random Fields (CRF). Comparing our approach with Hidden Markov Models (HMM) shows improved recognition when using Combined CRFs. We also show the impact of using prior information in the process by including the syntactic information of the transcriptions as new context.

The objectives of this research work, which is closely related to pattern discovery and management, are threefold: (i) handle the problem of pattern manipulation by defining operations on patterns, (ii) study the problem of enriching and updating a pattern set (e.g., concepts, rules) when changes occur in the user's needs and the input data (e.g., object/attribute insertion or elimination, taxonomy utilization), and (iii) approximate a "presumed" concept using a related pattern space so that patterns can augment data with knowledge. To conduct our work, we use formal concept analysis (FCA) as a framework for pattern discovery and management. We take a joint database-FCA perspective by defining operators similar in spirit to relational algebra operators, investigating approximation in concept lattices, and exploiting existing work on operations on contexts and lattices to formalize such operators.
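As background on the FCA framework mentioned above, a formal concept of a context (objects described by attributes) is a pair (extent, intent) in which the intent is exactly the set of attributes shared by the extent, and the extent is exactly the set of objects having every attribute of the intent. The sketch below enumerates all concepts of a tiny context by brute force; the function name `formal_concepts` and the toy context are illustrative assumptions, and none of the operators proposed in the work is reproduced.

```python
from itertools import combinations

def formal_concepts(context):
    """All formal concepts of a small context, by brute force.

    context: dict mapping each object to its set of attributes.
    Returns a set of (extent, intent) pairs of frozensets.
    """
    objects = sorted(context)
    all_attrs = set().union(*context.values())

    def common_attrs(objs):
        # Attributes shared by all given objects (every attribute for the empty set).
        if not objs:
            return set(all_attrs)
        return set.intersection(*(context[o] for o in objs))

    def objs_having(attrs):
        # Objects that possess every attribute in attrs.
        return {o for o in objects if attrs <= context[o]}

    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = common_attrs(objs)
            extent = objs_having(intent)  # closure of the chosen object set
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Toy context: three objects over attributes a, b, c.
toy = {"o1": {"a", "b"}, "o2": {"b", "c"}, "o3": {"b"}}
lattice = formal_concepts(toy)
```

Ordering these pairs by inclusion of their extents yields the concept lattice. The enumeration over all object subsets is exponential, so this is suitable only for illustrating the definition.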