Academia.edu

local outlier factor (LOF)

231 papers
0 followers
About this topic
Local Outlier Factor (LOF) is an algorithm used in anomaly detection that identifies outliers in a dataset by measuring the local density deviation of a data point with respect to its neighbors. It quantifies how isolated a point is compared to its surrounding points, allowing for the detection of local anomalies.
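The density comparison described above can be sketched in a few lines. This is a minimal brute-force illustration, not a reference implementation: the function name `lof_scores`, the sample points, and the choice `k=2` are assumptions made for the example. It also assumes the points are distinct and takes exactly `k` neighbours per point, ignoring the tie handling of the full LOF definition.

```python
import math

def lof_scores(points, k):
    """Return a LOF score for every point (brute force, O(n^2) distances)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbours of each point (excluding itself) and its k-distance.
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_dist[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' densities to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical data: four points on a unit square plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
print(scores)
```

Scores close to 1 indicate a point whose local density matches that of its neighbours; the isolated point receives a score well above 1.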

Key research themes

1. How can Local Outlier Factor (LOF) algorithms be adapted for scalable, real-time outlier detection in dynamic, high-dimensional data environments such as data streams and spatio-temporal data?

This research theme focuses on the challenges and methods of extending LOF algorithms—originally designed for static datasets—to handle the complexities of big data streams, multidimensional data, and spatio-temporal outliers. Key issues include computational scalability, concept drift responsiveness, and integrating spatial-temporal context for improved detection accuracy. Addressing these challenges enables LOF to operate effectively in domains requiring continuous anomaly monitoring such as network intrusion detection, sensor networks, and environmental monitoring.

Key finding: This paper systematically reviews LOF and its variants developed for data stream environments, highlighting the limitations of traditional LOF on static datasets when applied to streaming data due to concept drift and volume...
Key finding: The authors introduce ST-BOF, a spatio-temporal extension to LOF, enabling simultaneous evaluation of spatial and temporal contexts. Their ST-BDBCAN and Approx-ST-BDBCAN algorithms leverage ST-BOF for clustering and outlier...
Key finding: This work proposes an unsupervised online outlier detection technique for multi-dimensional data streams using Relative Neighbourhood Dissimilarity (ReND). ReND adaptively learns from streaming data under concept drift,...
Key finding: The paper presents a novel sampling-based outlier detection method designed for large, high-dimensional datasets that combines randomized partitioning with efficient sampling to create a candidate outlier set. This approach...

2. What methodologies and statistical measures optimize the robustness and accuracy of outlier detection in multivariate, skewed, and circular data distributions beyond classical LOF applications?

This research theme investigates adapting LOF and other related methods to specialized data types such as skewed multivariate distributions and circular data, emphasizing robustness to data characteristics that invalidate assumptions of classical methods. It explores combining LOF with statistical depth functions, adjusted boxplots, and alternative metrics for better detection performance in these non-standard data spaces critical to applications like directional data analysis, regression diagnostics, signal processing, and environmental measurements.

Key finding: This study compares the outlier detection capability of four robust outlyingness functions tailored for multivariate skew-normal distributions, noting that traditional Mahalanobis distance methods are insufficient for skewed...
Key finding: The authors extend LOF to univariate circular data by mapping angular observations to bivariate Cartesian coordinates and applying LOF in this transformed space. This approach overcomes limitations of existing...
Key finding: Focusing on univariate location-scale families including skewed distributions, this paper modifies traditional boxplot fences by employing semi-interquartile ranges and controlling false positive outlier rates over sample...
Key finding: This paper proposes a combination of the distribution-free Harrell-Davis median estimator with robust covariance estimation to handle simultaneous cellwise and casewise multivariate outliers, improving classification accuracy...
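The circular-data finding above maps angles to Cartesian coordinates before applying LOF. The sketch below illustrates why such a mapping helps, assuming angles in degrees and a projection onto the unit circle; the helper name `to_unit_circle` and the sample angles are hypothetical, and the truncated summary does not show the paper's exact transformation.

```python
import math

def to_unit_circle(angles_deg):
    """Map angles in degrees to (cos, sin) points on the unit circle."""
    return [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in angles_deg]

# Hypothetical observations: 1 and 359 degrees are neighbours on the circle.
angles = [1.0, 359.0, 180.0]
xy = to_unit_circle(angles)

# The raw difference treats 1 and 359 degrees as far apart ...
raw_gap = abs(angles[0] - angles[1])
# ... while the chord length in the plane respects the wrap-around.
chord_near = math.dist(xy[0], xy[1])   # points 2 degrees apart
chord_far = math.dist(xy[0], xy[2])    # points 179 degrees apart

print(raw_gap, chord_near, chord_far)
```

Any distance-based detector, LOF included, can then be run on the mapped points with ordinary Euclidean distance.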

3. How can Local Outlier Factor (LOF) be integrated with advanced machine learning and deep learning techniques to improve anomaly detection in complex rule-based systems and cybersecurity?

This research avenue explores hybrid approaches that combine LOF with modern AI architectures including autoencoders, attention mechanisms, and clustering algorithms to optimize detection of anomalous patterns in rule-based knowledge bases, network security, and related domains. The focus is on enhancing feature representation, temporal dependency modeling, and leveraging unsupervised learning paradigms to complement LOF's density-based outlier scoring for improved precision and reduced false alarms.

Key finding: This research demonstrates the synergistic use of LOF and Self-Organizing Maps (SOM) to identify unusual or rare rules in rule-based knowledge bases, improving completeness and quality of decision support systems. LOF...
Key finding: The paper presents a semi-supervised framework combining an attention-equipped GRU autoencoder with LOF for feature extraction and anomaly scoring in DDOS detection. The integration of temporal feature learning (via GRU and...
Key finding: By incorporating the LOF algorithm within a hybrid machine learning and statistical framework, this study improves unsupervised anomaly detection in high-dimensional flight data monitoring. The methodology adapts LOF to...
Key finding: This work enhances collaborative outlier detection by employing locality-sensitive hashing (LSH) ensemble methods combined with LOF principles, enabling mergeable and privacy-preserving model aggregation across decentralized...

All papers in local outlier factor (LOF)

In the age of smart manufacturing, there is typically a multitude of sensors connected to each assembly line. The amount of data generated could be used to create a digital twin model of the complete process, wherein virtual...
A single outlier detection procedure for data generated from BL(1,1,1,1) models is developed. It is carried out in three stages. Firstly, the measures of impact of an IO and an AO, denoted by ω_IO and ω_AO respectively, are derived based on...
Outlier detection is a data mining application. Outliers include noisy data and are studied in various domains. Various, more generic techniques have already been researched. We surveyed various techniques and...
Random Forest (RF) is an ensemble classification technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there...
An outlier ranking tree selection approach to extreme pruning of random forests. FAWAGREH, K., GABER, M.M. and ELYAN, E. 2016. An outlier ranking tree selection approach to extreme pruning of random forests. In Jayne, C. and Iliadis, L....
Thompson's partition of a cyclic subnormal operator into normal and completely non-normal components is combined with a noncommutative calculus for hyponormal operators for separating outliers from the cloud, in rather general point...
Feature selection is the preprocessing step of identifying a subset of data from high-dimensional data. To identify the required data, feature selection algorithms are used. Like Relief, Parzen-Relief algorithms, it attempts to...
Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields such as...
Anomaly detection is referred to as a process in which the aim is to detect data points that follow a different pattern from the majority of data points. Anomaly detection methods suffer from several well-known challenges that hinder their...
The increased use of real-time water quality monitoring using automated systems with sensors both demands and makes it possible to identify unexpected values in time. Anomalies are caused by technical issues that are likely to prevent...
Detection of anomalies (outliers or abnormal instances) is an important element in a range of applications such as fault, fraud, suspicious behavior detection and knowledge discovery. In this article we propose a new method for anomaly...
Let G be a bounded simply-connected domain in the complex plane C, whose boundary Γ := ∂G is a Jordan curve, and let {p_n}_{n=0}^∞ denote the sequence of Bergman polynomials of G. This is defined as the sequence of polynomials that are...
Random Projection (RP) is a popular and efficient technique to preprocess high-dimensional data and to reduce its dimensionality. While RP has been widely used and evaluated in stationary data analysis scenarios, non-stationary...
An outlier ranking tree selection approach to extreme pruning of random forests. In Jayne, C. and Iliadis, L. (eds.) Communications in computer and information science, 629, Engineering applications of neural networks: proceedings of the...
Due to the unavailability of a prediction system, the success rate of liver transplants is low. For optimal organ allocation, the MELD score is used, which follows the sickest-first policy. In the sickest-first policy, the sickest patient gets...
Outlier detection (OD) is a key problem, for which numerous solutions have been proposed. To deal with the difficulties associated with outlier detection across various domains and data characteristics, ensembles of outlier detectors have...
Today, failure mode characterization and early detection are key issues in complex assets. This is due to the negative impact of corrective operations and the conservative strategies usually put in practice, focused on preventive...
As machine learning and cybersecurity continue to explode in the context of the digital ecosystem, the complexity of cybersecurity data combined with complicated and evasive machine learning algorithms leads to vast difficulties in...
In this paper, the impact of k-means and local outlier factor on a data set is studied. An outlier is an observation that is different from or inconsistent with the rest of the data. However, the main challenges of outlier detection are...
We propose a deterministic initialization of the Echo State Network reservoirs to ensure that the activation of its internal echo state representations reflects similar topological qualities of the input signal which should lead to a...
Growing interest in data and analytics in education, teaching, and learning raises the priority for increased, high-quality research. Data mining is a technique used to find possibly new information from huge amounts of data....
This paper proposes a novel incremental modification to the Local Outlier Probabilities algorithm, which is commonly used for anomaly detection, to allow it to detect outliers nearly instantly in data streams. The proposed incremental...
The Internet of Things (IoT) has become the most popular term in recent advances in healthcare devices. Healthcare data in IoT processes and structures is very sensitive and critical in terms of health and technical considerations....
Wind energy integration research generally relies on complex sensors located at remote sites. The procedure for generating high-level synthetic information from databases containing large amounts of low-level data must therefore account...
Statistical re-sampling techniques have been used extensively and successfully in the machine learning approaches for generation of classifier and predictor ensembles. It has been frequently shown that combining so-called unstable...
This article investigates the effects of outliers on the coefficient of determination, R², which is computed using the Ordinary Least Squares (OLS) estimator. It is now evident that OLS is greatly affected by outliers and hence R² is also...
In this paper, a novel method for detecting transformer winding axial displacement has been presented. In this method, which is based on UWB radar imaging, a UWB pulse is transmitted to the transformer winding and the reflection from it...
This paper investigates the use of an unsupervised hybrid statistical–local outlier factor algorithm to detect anomalies in time-series flight data. Flight data analysis is an activity carried out by airlines primarily as a means of...
In large datasets, identifying exceptional or rare cases with respect to a group of similar cases is considered a very significant problem. The traditional problem (Outlier Mining) is to find exceptional or rare cases in a dataset...
The main objective of this paper is to design and develop a model for power system transmission line fault identification using the Local Outlier Factor (LOF) technique based on data mining. 9-bus power system and 30-bus power systems...
Our research deals with intelligent decision support systems based on rule-based knowledge bases. Decision support systems use rules “If a condition, then a decision” as a form of knowledge representation. In the process of inference,...
The aim of the article is to analyze the use of the LOF, COF and K-means algorithms for outlier detection in rule-based knowledge bases. The subject of outlier mining is very important nowadays. Outliers in rules mean unusual rules which are...
The problem of outlier detection in univariate circular data was the object of increased interest over the last decade. New numerical and graphical methods were developed for samples from different circular probability distributions. The...
The detection of outliers in text documents is a highly challenging task, primarily due to the unstructured nature of documents and the curse of dimensionality. Text document outliers refer to text data that deviates from the text found...
One of the most significant security concerns confronting network technology is the detection of distributed denial of service (DDOS). This paper introduces a semi-supervised data-driven approach to the detection of DDOS attacks. The...
Detecting anomalies over real-world datasets remains a challenging task. Data annotation is an intensive human labor problem, particularly in sequential datasets, where the start and end time of anomalies are not known. As a result, data...