Academia.edu

Data classification

1,400 papers
60 followers
About this topic
Data classification is the process of organizing data into categories or classes based on shared characteristics or attributes. This systematic approach facilitates data management, retrieval, and analysis, enabling more efficient decision-making and information security by ensuring that data is appropriately handled according to its classification level.
Algorithms for the inference of association with sequential information have been proposed and used but are ineffective, in some cases, because too many candidate rules are extracted. Filtering the relevant ones is usually difficult and... more
We use the Expectation-Maximization (EM) algorithm to classify 3D aerial lidar scattered height data into four categories: road, grass, buildings, and trees. To do so we use five features: height, height variation, normal variation, lidar... more
This study explores the principal methods of data mining and their diverse applications across industries. Its purpose is to provide a comprehensive overview of key techniques: classification, clustering, association rule learning,... more
This paper investigates the existing practices and prospects of medical data classification based on data mining techniques. It highlights major advanced classification approaches used to enhance classification accuracy. Past research has... more
We propose a genetic programming based approach for generating prototypes in a classification problem. In this context, the set of prototypes to which the samples of a data set can be traced back is coded by a multitree, i.e. a set of... more
For several decades, researchers have been interested in various types of generalized regression models which admit changing parameter values at different time periods. The so-called regime switching models have given a lot of... more
The paper deals with a method for determining a switching combination of several local linear models using only knowledge of the input-output data. The method is a direct optimisation of the sum of squared errors between... more
Nowadays privacy has become a major concern; security goals such as confidentiality, integrity and availability do not ensure privacy. Data mining is a threat to privacy. Researchers today focus on how to ensure privacy while... more
We consider systems of equations of the form $X_i = \bigcup_{a \in A} a \cdot F_{i,a} \cup \delta_i$, $i = 1, \ldots, n$, where $A$ is the underlying alphabet, the $X_i$ are variables, the $F_{i,a}$ are boolean functions in the variables $X_i$, and each $\delta_i$ is either the empty word or the empty set.... more
This paper studies some properties and generalizations of the "canonical structure" operator on strings, denoted C, which was originally introduced as part of the study of Papyrus Oxyrhynchus 90, in [Har25].
The present work treats the data classification task by means of evolutionary computation techniques using three ingredients: genetic programming, competitive coevolution, and context-free grammar. The robustness and... more
This article proposes a new classifier, inspired by a characteristic of biological immune systems, which also belongs to the class of k-nearest-neighbors algorithms. Its main feature is a suppression mechanism used to reduce the size of the... more
This paper presents a hardware implementation of Artificial Neural Networks (ANN) used for the recognition of natural human hand gestures. The main goal of this project is to implement a recognition system that recognizes data gathered from... more
4th Social Science Research Days (Journées de recherches en sciences sociales), Rennes, 9 and 10 December 2010. Fair trade (Commerce Equitable, CE) has grown rapidly in recent years. The number of certified cooperatives is rising sharply.... more
This paper presents a comparative evaluation between a classification strategy based on the combination of the outputs of a neural (NN) ensemble and the application of Support Vector Machine (SVM) classifiers in the analysis of remotely... more
Blok and Pigozzi have shown that a deterministic finite automaton can be naturally viewed as a logical matrix. Following this idea, we use a generalisation of the matrix concept to deal with other kinds of automata in the same algebraic... more
The huge size of multimedia data requires efficient data classification and organization to provide effective multimedia data manipulation. Such valuable data must be captured and stored for potential purposes. One of the main... more
A new algorithm for identification of discrete time Hybrid Systems in the Piece-Wise Affine (PWA) form is introduced. This problem involves the estimation of both the parameters of the affine submodels and the partition of the PWA map... more
This paper contributes a comprehensive survey of the critical security challenges posed by cloud computing. It highlights the challenges and loopholes that persist in cloud environments despite all the efforts adopted by... more
In this paper we discuss a data mining framework for constructing intrusion detection models. The key ideas are to mine system audit data for consistent and useful patterns of program and user behavior, and use the set of relevant system... more
In this paper we describe a data mining framework for constructing intrusion detection models. The first key idea is to mine system audit data for consistent and useful patterns of program and user behavior. The other is to use the set of... more
Anonymization is a practical approach to protecting privacy in data. The major objective of privacy-preserving data publishing is to protect private information in data while keeping the data useful for some intended applications, such as... more
Problems of data classification can be studied in the framework of regularization theory as ill-posed problems. In this framework, loss functions play an important role in the application of regularization theory to classification. In... more
Municipal solid waste (MSW) has become one of the biggest problems for public administrations, since solving it involves several issues, among them the forms of treatment and final disposal. One of the forms... more
Social media have become popular environments for communication. For this reason, analyzing the sentiment that users express in their social media posts is an important research field. However, detecting polarity in such content is a... more
Social media have become a popular environment for communication. For this reason, analyzing the sentiment that users express in their posts on social networks is an important research field. However, detecting polarity in such... more
In this paper, an unsupervised learning algorithm, neighborhood linear embedding (NLE), is proposed to discover the intrinsic structures such as neighborhood relationships, global distributions and clustering property of a given set of... more
This paper presents a comprehensive examination of the underlying structures and behaviours of finite group automata and group machines, delving into their intricate relationships and properties. Researchers have developed a comprehensive... more
Through the use of patient information, data mining has proven to be a powerful tool in the healthcare sector for tracking illness patterns. While diagnostics are crucial for accurate medical diagnosis and prognosis, the complex nature of... more
We address the problem of classifying EEG data collected during motor imagery in the context of brain-computer interfaces (BCI). We propose an approach based on Hidden Markov Models (HMM). Our approach... more
Natural Language Processing has become one of the revolutionary technologies in data governance, particularly in enhancing metadata management and data catalogues. The explosive growth of data brings forth several issues for an... more
An imbalanced classification problem is one in which the distribution of instances across defined classes is uneven or biased in one direction or another. In data mining, the probabilistic neural network (PNN) classifier is a well-known... more
OVERVIEW: When learning defect detectors from static code measures, NaiveBayes learners are better than entropy-based decision-tree learners. Also, accuracy is not a useful way to assess those detectors. Further, those learners need no... more
Distributed denial of service (DDoS) attacks involve disrupting a target system by flooding it with an immense volume of traffic originating from numerous sources. These attacks can disrupt online services, causing financial losses... more
AI is simply an avatar of human intelligence. In a sense, it aims to reproduce the various intellectual processes at work in problem solving and decision making. The only difference lies both in... more
Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers... more
Weighted finite-state transducers are used in many applications such as text, speech and image processing. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers,... more
In this paper, we examined three vector quantization (VQ) methods used for the unsupervised classification (clustering) of functional magnetic resonance imaging (fMRI) data. Classification means that each brain volume element (voxel),... more
This paper introduces autoregressive (AR) modeling as a novel method to classify outputs from gas chromatography (GC). The inverse Fourier transformation was applied to the original sensor data, and then an AR model was applied to... more
In the past decade, there has been a sharp increase in publications describing applications of convolutional neural networks (CNNs) in medical image analysis. However, recent reviews have warned of the lack of reproducibility of most such... more
Breast cancer (BC) is a major global health concern. Detecting BC at an early stage gives more treatment options and can help avoid more aggressive treatments. The use of machine learning (ML) in BC prediction offers significant potential... more
Online Social Networks (OSNs) face escalating security threats that imperil user privacy. Conventional Deep Learning methods, relying predominantly on fixed learning rates, encounter limitations when capturing the nuanced intricacies of... more
Data has become an indispensable part of our daily lives in this information age. The amount of data generated is growing exponentially due to technological advances. This voluminou ...
Classifying educational data into a particular category remains challenging due to the massive and extensive number of variables within the dataset. This paper emphasizes a new algorithm for variational inclusion problems with the... more
Numerous studies in the field of education analytics have attempted to discover a significant indicator and predictor of the digital proficiency level of pre-service teachers. While university course alterations in their academic... more
Chaire de responsabilité sociale et de développement durable (ESG UQÀM), on fair trade: fair trade and sustainable development; fair trade and the objectives of... more
Given a sequence of image pairs we describe a method that segments the observed scene into static and moving objects while it rejects badly matched points. We show that, using a moving stereo rig, the detection of motion can be solved in... more