Academia.edu

Data Mining

23 papers
132 followers
About this topic
Data mining is the computational process of discovering patterns, correlations, and anomalies in large datasets using statistical, mathematical, and machine learning techniques. It aims to extract valuable information and insights from data to support decision-making and predictive analysis.

Key research themes

1. What advanced data reduction strategies optimize efficient processing of large-scale and resource-constrained datasets?

This theme explores explicit strategies for reducing large volumes of data in various application domains including aircraft structural health monitoring, wireless sensor networks (WSNs), embedded systems, and scientific simulations. The research focuses on balancing data fidelity, computational constraints, and transmission costs through algorithmic innovations such as signal processing-based data compression, prediction-based reduction, and statistical selection of metrics. Understanding these strategies is crucial for maintaining system efficiency and reliability when handling voluminous or complex data under resource limitations.

Key finding: This paper identifies digital signal processing methods, such as recursive singular spectrum analysis and digital filtering, as key enablers for data reduction in aircraft health monitoring. It emphasizes that signal...
Key finding: The study shows that applying a Least Mean Square algorithm with variable step size (LMS-VSS) for data prediction significantly reduces wireless sensor node transmissions by up to 97%, thus achieving energy savings without...
Key finding: This work formulates a software product line approach to develop tailor-made data management systems optimized for resource-constrained embedded devices. By downsizing existing database management systems and enabling high...
Key finding: The authors introduce a novel compression scheme based on statistical exchangeability and the Kolmogorov-Smirnov test to identify blocks of streaming scientific data with similar distributions, enabling reconstruction from a...
Key finding: The paper proposes a Fog computing architecture that performs on-site data mining and pattern recognition on raw multimodal wearable sensor data to greatly reduce transmission loads to the cloud. Implemented on low-power...
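
To make the prediction-based reduction idea in this theme concrete, here is a minimal sketch of a dual-prediction scheme: the sensor node and the sink run the same LMS predictor, and the node transmits a reading only when the prediction misses it by more than a tolerance. This is an illustration under stated assumptions, not the LMS-VSS method from the paper above: it uses a normalized LMS update with a fixed step size, and the function name `dual_prediction_reduce` and the parameters `order`, `mu`, and `tolerance` are hypothetical.

```python
def dual_prediction_reduce(readings, order=3, mu=0.5, tolerance=0.5):
    """Simulate prediction-based data reduction for one sensor stream.

    Returns (number of transmitted readings, values reconstructed at the sink).
    Assumption: fixed-step normalized LMS, not a variable step size rule.
    """
    weights = [0.0] * order      # filter state shared by node and sink
    history = []                 # values both sides agree on
    reconstructed = []
    sent = 0
    for x in readings:
        if len(history) < order:
            # Warm-up: not enough history to predict, so send the raw reading.
            value = x
            sent += 1
        else:
            window = history[-order:]
            prediction = sum(w * h for w, h in zip(weights, window))
            error = x - prediction
            if abs(error) > tolerance:
                # Prediction too far off: transmit, and let both sides
                # adapt the filter with the same normalized LMS step.
                sent += 1
                norm = sum(h * h for h in window) + 1e-9
                weights = [w + mu * error * h / norm
                           for w, h in zip(weights, window)]
                value = x
            else:
                # Prediction is good enough: nothing is sent and the sink
                # uses its own prediction in place of the reading.
                value = prediction
        history.append(value)
        reconstructed.append(value)
    return sent, reconstructed


if __name__ == "__main__":
    stream = [20.0] * 50
    sent, rebuilt = dual_prediction_reduce(stream)
    print(f"transmitted {sent} of {len(stream)} readings")
```

By construction, every reconstructed value is either the exact reading or a prediction within `tolerance` of it, so the tolerance directly trades transmission count against fidelity.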

2. How can lossless and lossy compression algorithms be tailored and benchmarked for domain-specific data?

This research theme addresses the design, adaptation, and empirical evaluation of compression techniques—both lossless and lossy—crafted for specific types of data such as remote sensing images, scientific simulation outputs, and transactional databases. The focus includes algorithmic developments like Huffman and arithmetic coding, specialized transformations like Karhunen-Loève Transform (KLT), and comprehensive benchmarking platforms evaluating compression ratios, error bounds, and application-specific fidelity requirements. These studies inform best practices for data storage and transmission where data integrity and domain-specific accuracy guarantees are critical.

Key finding: This paper details the implementation and comparative performance analysis of classical lossless compression algorithms: Huffman coding and arithmetic coding. It clarifies how Huffman codes assign shorter codes to higher...
Key finding: SDRBench establishes a standardized benchmark comprising over ten diverse scientific datasets from multiple domains to evaluate lossy compressors fairly. It integrates state-of-the-art compressors, assesses critical quality...
Key finding: The survey categorizes remote sensing data compression into classification/clustering, lossless, and lossy methods, emphasizing the importance of feature extraction to reduce dimensionality prior to transmission. It discusses...
Key finding: The work critiques traditional mean square error (MSE) based compression techniques and introduces a compression framework guaranteeing a strict per-point error bound (L∞ norm) on three-dimensional measurement data relevant...
Key finding: This paper proposes a hybrid data reduction method combining feature selection and instance reduction specifically tailored to transactional data classification in high-normalization environments. It addresses the challenge...
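
As an illustration of the Huffman coding named in this theme, the sketch below builds a prefix code from symbol counts by repeatedly merging the two least frequent subtrees, so that frequent symbols end up with shorter codewords. It is a textbook-style sketch rather than the implementation evaluated in the paper above; the function name `huffman_code` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each symbol in `text` to a binary codeword."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)   # lightest subtree
        w2, _, codes2 = heapq.heappop(heap)   # second lightest subtree
        # Merging pushes both subtrees one level deeper in the code tree.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


if __name__ == "__main__":
    text = "aaaabbc"
    code = huffman_code(text)
    encoded = "".join(code[s] for s in text)
    print(code, len(encoded), "bits")
```

For the sample string `aaaabbc`, the most frequent symbol `a` receives a one-bit codeword and the rarer symbols receive two bits each.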

3. What roles do data mining, predictive analytics, and knowledge discovery play in transforming raw data into actionable insights across domains?

This theme investigates the methodologies and frameworks for extracting meaningful knowledge from raw or streaming datasets through data mining and predictive analytics. The focus includes conceptual foundations distinguishing data, information, and knowledge; the application of robust regression techniques (e.g., Least Clipped Absolute Deviation); text and learning analytics interactions; and social media influence studies. These works collectively elucidate processes to automate pattern detection, forecasting, and informed decision-making in academic, telehealth, education, and social networking contexts.

Key finding: This chapter delineates the conceptual progression from raw data to actionable knowledge via data mining techniques, explaining how automated or semi-automated pattern discovery (e.g., association rules, classification)...
Key finding: The authors introduce a novel robust statistical estimator named least clipped absolute deviation (LCAD), which generalizes the skipped median concept to regression, enhancing robustness to asymmetric outliers. The method...
Key finding: Employing Khan's E-Learning Framework, this work contextualizes educational data analytics in enhancing pedagogical strategies, technological integration, interface design, and ethical governance in learning environments. It...
Key finding: Utilizing exploratory research with quantitative methods, the study assesses the influence of social networking sites (SNS) on academic behavior, study habits, and social interactions among students and scholars. Data mining...
Key finding: This paper proposes a dynamic invariant detection framework leveraging data mining methods, including association rule mining and decision trees, to infer conditional invariants, i.e., program properties true under specific...
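
Association rules appear in several of the findings above. As a rough sketch of what such a rule measures, the code below computes support and confidence for single-item rules of the form "lhs implies rhs" over a small set of transactions. The function name `pair_rules`, the thresholds, and the sample data are all hypothetical; a practical miner (Apriori, FP-Growth, and similar) would also handle larger itemsets and prune the search.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for rules between item pairs."""
    sets = [set(t) for t in transactions]
    n = len(sets)
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= s for s in sets) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    for lhs, rhs, sup, conf in pair_rules(baskets):
        print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

With the sample baskets, only the rule from butter to bread passes both thresholds: every basket containing butter also contains bread, while bread appears without butter in one basket.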

All papers in Data Mining

In recent years, subgraph mining from a large collection of graph databases has been a crucial problem. Existing work on subgraph mining is based on a threshold value and returns graphs similar to a query graph. However, the number of...
Individuals with emotional problems experience uncontrollable and intensive negative affect. They do not have the ability to manage and regulate their acute emotional experiences. The main aim of the present study was to compare...
Medical diagnosis is a fundamental process for identifying diseases, so this research explores several pattern-based data classification methods such as decision trees, clustering, and principal component analysis, the...
In recent years, we have witnessed several profound changes in industrial production. Many industrial processes are now automated in order to guarantee production quality and minimize costs. Currently, the...
Sequential pattern mining is a popular data mining task with wide applications. However, the set of all sequential patterns can be very large. To discover fewer but more representative patterns, several compact representations of...
Large Scale Graph Matching (LSGM) is one of the fundamental problems in Graph theory and it has applications in many areas such as Computer Vision, Machine Learning, Pattern Recognition and Big Data Analytics (Data Science). Matching...
Allocating urban land-uses to land-units with regard to different criteria and constraints is considered a spatial multiobjective problem. Generating various urban land-use layouts with respect to defined objectives for urban land-use...
Information has become increasingly important to us. This study presents an introduction to the concept of data mining as a tool that many companies and organizations turn to in order to improve their...
Higher education for the 21st century continues to promote discoveries in the field through learning analytics (LA). The problem is that the rapid embrace of LA diverts educators’ attention from clearly identifying requirements and...
A common form of car insurance fraud is to submit false claims or to exaggerate the severity of an accident. However, the growing performance of information processing systems and storage capacities creates new possibilities to deal with...
Creation-oriented software allows the user to work according to their own vision and rules. From the perspective of software analysis, this is challenging because there is no certainty as to how the users are using the software and what...
Detecting malicious content is a recent area of research in which most previous work has focused on the identification of malicious text on social media. In this project, however, we present a methodology based on two new aspects: the detection of...
The Internet now reaches everywhere and has driven major changes in every sector. There are large numbers of active as well as passive users of the Internet and related websites, SNS, applications, etc. Social networking is one of the techniques...
Software engineering comprises several processes, such as designing, implementing, and modifying software. All these processes aim to develop software quickly while achieving high-quality, efficient, and maintainable...
Data mining is the activity of extracting useful knowledge from a large database using any of its techniques. In this paper we use classification, one of the major data mining models, which is used to predict previously...
With the commercial availability of high-resolution satellite images in recent decades, the issue of extracting accurate 3D spatial information in many fields has become the centre of attention and applications related to photogrammetry...
Distributed systems play a vital role in Frequent Subgraph Mining (FSM), which extracts frequent subgraphs from large graph databases. They help to reduce memory requirements and computational costs, as well as to increase data security by...
Nowadays, cervical cancer is regarded as one of the main causes of cancer death among women worldwide. In this study, we collected 741 instances of cervical cancer data, preprocessed the data, explored high-ranked significant features...
Association rule mining forms the core of data mining and is regarded as one of its most well-researched techniques. It aims to extract interesting correlations, frequent patterns, associations or causal structures among sets...
Monitoring land-use changes is one of the important applications of remote sensing and geographic information systems. In this study, a framework for change monitoring in multitemporal satellite images is presented by Iteratively...
SPMF is an open-source data mining library, specialized in pattern mining, offering implementations of more than 120 data mining algorithms. It has been used in more than 310 research papers to solve applied problems in a wide range of...
Classification is the most common method for information extraction from remotely sensed images. Conventional classification methods are mostly based on spectral information. While particularly in high spatial resolution images, spatial...
Sequential rule mining is an important data mining problem with multiple applications. An important limitation of algorithms for mining sequential rules common to multiple sequences is that rules are very specific and therefore many...
We present an algorithm for mining sequential rules common to several sequences, such that rules have to appear within a maximum time span. Experimental results with real-life datasets show that the algorithm can reduce the execution...
The traditional way of deriving information from data relied on manual interpretation and analysis. This kind of data analysis is slow, expensive, and subjective. Computer technology is expected to help when data use and analysis are...
The Aryan tribes of India founded the Vedic civilization and culture, and the Aryan tribes of Iran founded the Avestan civilization and culture. The study and recognition of the myths of Iran and India is a suitable platform for understanding and...
Nowadays, banking systems collect large amounts of data every day. The collected data include customer information, transaction details, and credit card details. Many decisions are made within a single day. The taken...
Predicting the next item of a sequence over a finite alphabet is highly important in Web Mining. This paper presents a solution to improve the performance of sequence prediction; first and foremost, predicting what is the next Web page...
Cultural history is an interdisciplinary approach that tries to view the two main fields of the humanities, namely history and culture, in interaction with each other. Cultural history is one of the thriving branches that has expanded since the 1970s. In fact, it can be said that cultural history...
Graphs play a notable role in daily life. For instance, they are used in a variety of fields such as social networks, malware detection, and biological networks. Graph data processing performed to extract useful information is known as graph...
Graphs are widely used to model complicated structures and link them with each other. Some of such structures are XML documents, social networks, and computer networks. Information and model extraction from graph databases is a graph...