Academia.edu

local outlier factor (LOF)

232 papers
0 followers
About this topic
Local Outlier Factor (LOF) is an algorithm used in anomaly detection that identifies outliers in a dataset by measuring the local density deviation of a data point with respect to its neighbors. It quantifies how isolated a point is compared to its surrounding points, allowing for the detection of local anomalies.
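The density comparison described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes all points are distinct, uses Euclidean distance, and takes exactly k neighbours per point (ties at the k-th distance are broken by index rather than all included). The function names `knn` and `lof_scores` and the toy data are hypothetical, chosen only for this example.

```python
import math

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k=3):
    """Compute a LOF score per point; values well above 1 suggest an outlier."""
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))  # assumes distinct points (sum > 0)

    # LOF: mean ratio of the neighbours' density to the point's own density.
    scores = []
    for i in range(n):
        ratios = [lrd[j] / lrd[i] for _, j in neighbours[i]]
        scores.append(sum(ratios) / len(ratios))
    return scores

# Toy data (assumed for illustration): a tight cluster plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
scores = lof_scores(points, k=3)
for p, s in zip(points, scores):
    print(p, round(s, 2))
```

Points inside the cluster score close to 1 because their density matches that of their neighbours, while the distant point scores much higher because its neighbours are far denser than it is. The choice of k is a tuning parameter and affects the scores.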

Key research themes

1. How can Local Outlier Factor (LOF) algorithms be adapted for scalable, real-time outlier detection in dynamic, high-dimensional data environments such as data streams and spatio-temporal data?

This research theme focuses on the challenges and methods of extending LOF algorithms—originally designed for static datasets—to handle the complexities of big data streams, multidimensional data, and spatio-temporal outliers. Key issues include computational scalability, concept drift responsiveness, and integrating spatial-temporal context for improved detection accuracy. Addressing these challenges enables LOF to operate effectively in domains requiring continuous anomaly monitoring such as network intrusion detection, sensor networks, and environmental monitoring.

Key finding: This paper systematically reviews LOF and its variants developed for data stream environments, highlighting the limitations of traditional LOF on static datasets when applied to streaming data due to concept drift and volume...
Key finding: The authors introduce ST-BOF, a spatio-temporal extension to LOF, enabling simultaneous evaluation of spatial and temporal contexts. Their ST-BDBCAN and Approx-ST-BDBCAN algorithms leverage ST-BOF for clustering and outlier...
Key finding: This work proposes an unsupervised online outlier detection technique for multi-dimensional data streams using Relative Neighbourhood Dissimilarity (ReND). ReND adaptively learns from streaming data under concept drift,...
Key finding: The paper presents a novel sampling-based outlier detection method designed for large, high-dimensional datasets that combines randomized partitioning with efficient sampling to create a candidate outlier set. This approach...

2. What methodologies and statistical measures optimize the robustness and accuracy of outlier detection in multivariate, skewed, and circular data distributions beyond classical LOF applications?

This research theme investigates adapting LOF and other related methods to specialized data types such as skewed multivariate distributions and circular data, emphasizing robustness to data characteristics that invalidate assumptions of classical methods. It explores combining LOF with statistical depth functions, adjusted boxplots, and alternative metrics for better detection performance in these non-standard data spaces critical to applications like directional data analysis, regression diagnostics, signal processing, and environmental measurements.

Key finding: This study compares the outlier detection capability of four robust outlyingness functions tailored for multivariate skew-normal distributions, noting that traditional Mahalanobis distance methods are insufficient for skewed...
Key finding: The authors extend LOF to univariate circular data by mapping angular observations to bivariate Cartesian coordinates and applying LOF in this transformed space. This approach overcomes limitations of existing...
Key finding: Focusing on univariate location-scale families including skewed distributions, this paper modifies traditional boxplot fences by employing semi-interquartile ranges and controlling false positive outlier rates over sample...
Key finding: This paper proposes a combination of the distribution-free Harrell-Davis median estimator with robust covariance estimation to handle simultaneous cellwise and casewise multivariate outliers, improving classification accuracy...
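The circular-data finding in this theme rests on mapping angles to points in the plane before any distance-based scoring. The sketch below illustrates only that mapping step, under the assumption that angles are given in degrees and placed on the unit circle; the summary does not give the authors' exact procedure, and the helper name `to_unit_circle` and the sample angles are hypothetical.

```python
import math

def to_unit_circle(angles_deg):
    """Map angles in degrees to (cos, sin) points on the unit circle."""
    return [
        (math.cos(math.radians(a)), math.sin(math.radians(a)))
        for a in angles_deg
    ]

# Sample angles (assumed for illustration): four near 0 degrees, one at 180.
angles = [358.0, 359.0, 1.0, 2.0, 180.0]
xy = to_unit_circle(angles)

# On the raw scale, 359 and 1 look far apart; in the plane they are adjacent.
naive_gap = abs(angles[1] - angles[2])   # difference on the raw degree scale
chord_near = math.dist(xy[1], xy[2])     # chord between 359 and 1 degrees
chord_far = math.dist(xy[1], xy[4])      # chord between 359 and 180 degrees

print(naive_gap, chord_near, chord_far)
```

Because the chord length respects the wrap-around at 0/360 degrees, a neighbour-based detector such as LOF run on the `xy` pairs treats 359 and 1 degrees as close, which a raw difference of angles would not.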

3. How can Local Outlier Factor (LOF) be integrated with advanced machine learning and deep learning techniques to improve anomaly detection in complex rule-based systems and cybersecurity?

This research avenue explores hybrid approaches that combine LOF with modern AI architectures including autoencoders, attention mechanisms, and clustering algorithms to optimize detection of anomalous patterns in rule-based knowledge bases, network security, and related domains. The focus is on enhancing feature representation, temporal dependency modeling, and leveraging unsupervised learning paradigms to complement LOF's density-based outlier scoring for improved precision and reduced false alarms.

Key finding: This research demonstrates the synergistic use of LOF and Self-Organizing Maps (SOM) to identify unusual or rare rules in rule-based knowledge bases, improving completeness and quality of decision support systems. LOF...
Key finding: The paper presents a semi-supervised framework combining an attention-equipped GRU autoencoder with LOF for feature extraction and anomaly scoring in DDoS detection. The integration of temporal feature learning (via GRU and...
Key finding: By incorporating the LOF algorithm within a hybrid machine learning and statistical framework, this study improves unsupervised anomaly detection in high-dimensional flight data monitoring. The methodology adapts LOF to...
Key finding: This work enhances collaborative outlier detection by employing locality-sensitive hashing (LSH) ensemble methods combined with LOF principles, enabling mergeable and privacy-preserving model aggregation across decentralized...

All papers in local outlier factor (LOF)

Removing objects that are noise is an important goal of data cleaning as noise hinders most types of data analysis. Most existing data cleaning methods focus on removing noise that is the result of low-level data errors that result from...
A gas sample is conditioned using a sample handling system (SHS) to remove particulate matter and moisture before it is sent through Continuous Emission Monitoring (CEM) devices. The performance of the SHS plays a crucial role in reliable...