Anomaly detection using real-valued negative selection

Gurpreet Singh

doi:10.1023/A:1026195112518

2003, Genetic Programming and Evolvable …

https://doi.org/10.1023/A:1026195112518

26 pages

Abstract

This paper describes a real-valued representation for the negative selection algorithm and its applications to anomaly detection. In many anomaly detection applications, only positive (normal) samples are available for training purposes. However, conventional classification algorithms need samples for all classes (e.g., normal and abnormal) during the training phase. This approach uses only normal samples to generate abnormal samples, which are used as input to a classification algorithm. This hybrid approach is compared against an anomaly detection technique that uses self-organizing maps to cluster the normal data sets (samples). Experiments are performed with different data sets and some results are reported.

FAQs

What advantages does real-valued negative selection offer over binary representations?

The research demonstrates that real-valued negative selection enables a more meaningful representation of data, facilitating better integration with machine learning algorithms. This was evidenced by improvements in anomaly detection accuracy in varying applications.

How does the hybrid neuro-immune system compare with traditional methods?

The hybrid system (HNIS) achieved a detection rate of 98% on the MIT-Darpa 99 dataset, surpassing traditional binary negative selection methods. In contrast, positive detection algorithms also performed well, reinforcing the versatility of HNIS.

What were the key parameters for optimizing the real-valued negative selection algorithm?

Parameters such as the detector radius (r = 0.1) and the adaptation rate (η = 1) were crucial for detector generation. The algorithm runs in O(num_iter · num_ab · (num_ab + |S|)) time, where |S| is the number of self samples, and it aims to distribute the detectors effectively over the non-self space.
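To make the roles of the radius, the adaptation rate, and the stated complexity more concrete, here is a minimal Python sketch of an iterative detector-generation loop. It is not the paper's algorithm: the movement rule, the decay schedule for the step size, and all function names are assumptions chosen for illustration. Only the parameters r, η, the number of iterations, the number of detectors, and the self set come from the text above. Each iteration compares every detector with all self samples and all other detectors, which is where a cost proportional to num_iter · num_ab · (num_ab + |S|) would arise.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def move_away(point, others, step):
    """Shift `point` by `step` along the mean direction pointing away from `others`."""
    if not others:
        return point
    direction = [sum(p - o[i] for o in others) / len(others)
                 for i, p in enumerate(point)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        return point
    # Clamp to the unit hypercube (assumes data normalized to [0, 1]).
    return [min(1.0, max(0.0, p + step * d / norm))
            for p, d in zip(point, direction)]


def generate_detectors(self_samples, num_ab=50, num_iter=100, r=0.1, eta=1.0, seed=0):
    """Hypothetical sketch: push detectors out of self regions and apart from each other."""
    rng = random.Random(seed)
    dim = len(self_samples[0])
    detectors = [[rng.random() for _ in range(dim)] for _ in range(num_ab)]
    for it in range(num_iter):
        step = eta * r * math.exp(-it / num_iter)  # assumed decay schedule
        for i, d in enumerate(detectors):
            near_self = [s for s in self_samples if dist(d, s) < r]
            if near_self:
                # The detector matches self samples: move it away from them.
                detectors[i] = move_away(d, near_self, step)
            else:
                # Otherwise spread out: move away from overlapping detectors.
                near_det = [o for j, o in enumerate(detectors)
                            if j != i and dist(d, o) < r]
                detectors[i] = move_away(d, near_det, step)
    # Keep only detectors that still lie outside the radius of every self sample.
    return [d for d in detectors if all(dist(d, s) >= r for s in self_samples)]
```

The surviving detectors could then serve as artificial abnormal samples for a supervised classifier, in the spirit of the hybrid approach described in the abstract.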

What dataset complexities were addressed in the experiments?

Experiments utilized complex datasets like the Mackey-Glass time series, exhibiting chaotic behavior, and the KDD Cup 99 dataset, featuring real network attacks, both presenting significant challenges for reliable anomaly detection. This diversity tested the robustness of the proposed methods across varying circumstances.

What was the performance outcome of the SOM compared to HNIS?

Self-Organizing Maps (SOM) exhibited slightly better results in some scenarios, achieving a detection rate exceeding 93% with low false alarms. However, HNIS demonstrated comparable performance, especially when detecting non-self samples, validating its efficacy.

Figures (12)

Figure 1. Illustrates an iteration of the real-valued negative selection algorithm. The input to the algorithm is a set of self samples represented by n-dimensional points (vectors). The algorithm tries to evolve a complement set of points (called antibodies or detectors) that cover the non-self space. This is accomplished by an iterative process that updates the position of the detector driven by two goals:

Figure 3. A hybrid immune system for anomaly detection that generates an anomaly characterization function from normal samples. … (feature vectors), which are used by the RNS algorithm [20] to generate abnormal samples. Subsequently, the normal and abnormal samples are used as input to a supervised algorithm that produces a classifier. This classifier corresponds to the anomaly detection function and is used during the testing phase to classify new samples as normal or abnormal.

Figure 4. Mackey-Glass time series: (a) normal, using τ = 30, (b) with an anomaly, τ = 17 from 300 to 400. The normal samples were produced from a time series with 500 elements generated using τ = 30 and discarding the first 1000 samples to eliminate the initial value effect. The resulting time series is shown in Figure 4(a). The test data (Figure 4(b)) is generated as before using τ = 30, but starting with different initial conditions. An abnormality is introduced between times 300 and 400 by changing the parameter τ to 17. It is important to note that this experimental setting is different from the one used by Dasgupta and Forrest [9]. In that work, the anomalous time series is identical to the normal one, with the exception of the portion between 1000 and 1500. In our case, the two series are completely
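The data-generation procedure described for Figure 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the equation coefficients (beta = 0.2, gamma = 0.1, exponent 10) are the values commonly used for the Mackey-Glass system, and the unit-step Euler integration and the initial values x0 are assumptions, since the text gives none of them. Only the delay values (30 and 17), the 1000 discarded samples, the 500 kept samples, and the anomaly window 300 to 400 come from the text.

```python
def mackey_glass(n_keep, tau_of_t, n_discard=1000, beta=0.2, gamma=0.1,
                 power=10, x0=1.2):
    """Euler-integrate dx/dt = beta*x(t-tau)/(1 + x(t-tau)**power) - gamma*x(t), dt = 1.

    tau_of_t maps the index of a kept sample (negative during warm-up)
    to the delay used at that step.
    """
    max_tau = max(tau_of_t(t) for t in range(-n_discard, n_keep))
    history = [x0] * (max_tau + 1)  # constant initial history (assumption)
    for step in range(n_discard + n_keep):
        tau = tau_of_t(step - n_discard)
        x = history[-1]               # x(t)
        delayed = history[-1 - tau]   # x(t - tau)
        history.append(x + beta * delayed / (1.0 + delayed ** power) - gamma * x)
    return history[-n_keep:]          # drop the warm-up samples


# Normal series: tau = 30 throughout.
normal = mackey_glass(500, lambda t: 30)

# Test series: different initial condition, tau switched to 17 for samples 300..399.
test = mackey_glass(500, lambda t: 17 if 300 <= t < 400 else 30, x0=0.9)
```

Because the test series starts from a different initial condition, it differs from the normal series everywhere, not only inside the anomaly window.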

Figure 5. Output value produced by the anomaly function when applied to the Mackey-Glass testing set. (a) HNIS (12 hidden neurons); (b) BNS (r = 8, Gray coding); (c) SOM (6x6, D∞ distance).

Figure 6. Output value, smoothed using Equation 8, produced by the anomaly function when applied to the Mackey-Glass testing set. (a) HNIS (12 hidden neurons, s = 5); (b) BNS (r = 8, Gray coding, s = 10); (c) SOM (6x6, D∞ distance, s = 10).

Figure 7. ROC curves for the BNS algorithm applied to the Mackey-Glass test data set.

5.1.3.2. SOM results

As discussed in Section 4, three different distance measures were proposed to calculate the anomaly detection function defined in Equation 2. Figure 8(a) shows the ROC curves corresponding to these distance measures. The Minkowski distance (Equation 6) shows a slight advantage over the other distance measures. Figure 8(b) shows ROC curves for different topologies of the SOM network. A higher number of neurons produces a more accurate classification; however, the difference between the curves is small, which suggests that a further increase in network complexity may not improve the accuracy.
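For readers unfamiliar with the Minkowski family of distances, the following Python sketch shows the general form, with the infinity-order case computed as the largest coordinate difference. The `anomaly_score` function is a hypothetical stand-in for the anomaly detection function of Equation 2, which is not reproduced here: it assumes the score is the distance from a sample to its nearest SOM codebook vector.

```python
def minkowski(x, y, lam):
    """Minkowski distance of order `lam`; lam = float('inf') gives the max-norm."""
    if lam == float("inf"):
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** lam for a, b in zip(x, y)) ** (1.0 / lam)


def anomaly_score(sample, codebook, lam):
    """Hypothetical score: distance to the closest SOM unit (assumption, not Equation 2)."""
    return min(minkowski(sample, w, lam) for w in codebook)
```

With this form, order 1 gives the city-block distance and order 2 the Euclidean distance, so switching the distance measure only changes the `lam` argument.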

Figure 8. ROC curves for SOM anomaly detection applied to Mackey-Glass test data set. (a) different distance measures using 6x6 topology; (b) different topologies using D. distance.

Figure 9. ROC curves for HNIS anomaly detection applied to Mackey-Glass test data set for different MLP topologies: 6, 12, and 16 hidden neurons.

Figure 10. Best ROC curves produced by each method for the Mackey-Glass test data set.
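As a reminder of how ROC curves such as these are obtained from a real-valued anomaly function, here is a small Python sketch that sweeps a decision threshold over the scores. The function name and the convention that higher scores mean "more abnormal" are assumptions for illustration; the source does not describe its evaluation code.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    scores: anomaly scores, higher meaning more abnormal (assumption).
    labels: True for abnormal samples, False for normal ones.
    """
    n_pos = sum(1 for is_abnormal in labels if is_abnormal)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lowering the threshold flags more samples as abnormal.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and not l)
        points.append((fp / n_neg, tp / n_pos))
    return points
```

A curve that passes close to the point (0, 1) corresponds to a high detection rate with few false alarms.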

Figure 11. Best ROC curves produced by HNIS and SOM methods for the MIT-Darpa 98 test data set.

Figure 12. Best ROC curves produced by each method for Darpa 99 test data set.

This data set corresponds to a breast cancer data set created at the University of Wisconsin Hospitals [36]. This particular data set was obtained from the University of California, Irvine Machine Learning Repository. Each data record consists of ten numerical attributes and the label (benign or malignant). The data set comprises 699 records, 16 of which have missing values (we did not use these records). The data was normalized to fit the interval [0,1], and we partitioned it into two sets, training and testing. The training set contains 271 benign records. The testing set is composed of 412 mixed benign and malignant records.
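The preprocessing described here can be illustrated with a short Python sketch. The text only says the data was normalized to the interval [0,1]; per-attribute min-max scaling is an assumption about how that was done, and the function name is hypothetical. The last lines check that the record counts quoted above are consistent.

```python
def min_max_normalize(records):
    """Scale each attribute (column) to [0, 1] using its own minimum and maximum."""
    columns = list(zip(*records))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # Constant columns carry no information and are mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in records
    ]


# Record counts from the text: 699 records, 16 with missing values dropped,
# leaving 271 training records plus 412 testing records.
usable = 699 - 16
assert usable == 271 + 412
```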

Figure 13. Best ROC curves produced by each method for Wisconsin breast cancer test data set.

References (38)

Ayara, M., J. Timmis, L. de Lemos, R. de Castro, and R. Duncan: 2002, 'Negative selection: How to generate detectors'. In: J. Timmis and P. J. Bentley (eds.): Proceedings of the 1st International Conference on Artificial Immune Systems (ICARIS). Canterbury, UK, pp. 89-98.
Balthrop, J., F. Esponda, S. Forrest, and M. Glickman: 2002a, 'Coverage And Generalization In An Artificial Immune System'. In: W. B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M. A. Potter, A. C. Schultz, J. F. Miller, E. Burke, and N. Jonoska (eds.): Proceedings of the Genetic and Evolutionary Computation Conference (GECCO). San Francisco, CA, pp. 3-10.
Balthrop, J., S. Forrest, and M. R. Glickman: 2002b, 'Revisiting LISYS: Parameters and Normal Behavior'. In: D. B. Fogel, M. A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, and M. Shackleton (eds.): Proceedings of the 2002 Congress on Evolutionary Computation CEC2002. USA, pp. 1045-1050.
Bradley, D. and A. Tyrrell: 2002, 'Immunotronics: Novel Finite-State-Machine Architectures with Built-In Self-Test Using Self-Nonself Differentiation'. IEEE Transactions on Evolutionary Computation 6(3), 227-238.
Caudell, T. and D. Newman: 1993, 'An adaptive resonance architecture to define normality and detect novelties in time series and databases'. In: IEEE World Congress on Neural Networks. Portland, OR, pp. 166-176.
Coello Coello, C. A. and N. Cruz Cortés: 2002, 'An Approach to Solve Multiobjective Optimization Problems Based on an Artificial Immune System'. In: J. Timmis and P. J. Bentley (eds.): First International Conference on Artificial Immune Systems (ICARIS). Canterbury, UK, pp. 212-221.
Dasgupta, D. and F. González: 2002, 'An Immunity-Based Technique to Characterize Intrusions in Computer Networks'. IEEE Transactions on Evolutionary Computation 6(3), 281-291.
Dasgupta, D.: 1999, Artificial immune systems and their applications. New York: Springer-Verlag.
Dasgupta, D. and S. Forrest: 1996, 'Novelty detection in time series data using ideas from immunology'. In: J. F. C. Harris (ed.): Proceedings of the 5th International Conference on Intelligent Systems. Cary, NC, pp. 82-87.
Dasgupta, D. and S. Forrest: 1999, 'An anomaly detection algorithm inspired by the immune system'. In: D. Dasgupta (ed.): Artificial immune systems and their applications. New York: Springer-Verlag, pp. 262-277.
Dasgupta, D. and N. S. Majumdar: 2002, 'Anomaly Detection in Multidimensional Data using Negative Selection Algorithm'. In: D. B. Fogel, M. A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, and M. Shackleton (eds.): Proceedings of the 2002 Congress on Evolutionary Computation (CEC2002). USA, pp. 1039-1044.
de Castro, L. N. and J. Timmis: 2002, Artificial Immune Systems: A New Computational Approach. London, UK: Springer-Verlag.
Denning, D. E.: 1987, 'An intrusion-detection model'. IEEE Transactions on Software Engineering 13(2), 222-232.
D'haeseleer, P., S. Forrest, and P. Helman: 1996, 'An immunological approach to change detection: algorithms, analysis and implications'. In: J. McHugh and G. Dinolt (eds.): Proceedings of the 1996 IEEE Symposium on Computer Security and Privacy. USA, pp. 110-119.
Fan, W., W. Lee, M. Miller, S. Stolfo, and P. Chan: 2001, 'Using artificial anomalies to detect unknown and known network intrusions'. In: N. Cercone, T. Y. Lin, and X. Wu (eds.): Proceedings of the 1st IEEE International conference on Data Mining. USA, pp. 123-130.
Forrest, S., A. Perelson, L. Allen, and R. Cherukuri: 1994, 'Self-nonself discrimination in a computer'. In: Proceedings IEEE Symposium on Research in Security and Privacy. Los Alamitos, CA, pp. 202-212.
Fox, K., R. Henning, J. Reed, and R. Simonian: 1990, 'A neural network approach towards intrusion detection'. In: Proc. 13th NIST-NCSC national computer security conference. Washington, DC, pp. 125-134.
González, F. and D. Dasgupta: 2002, 'An immunogenetic technique to detect anomalies in network traffic'. In: W. B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M. A. Potter, A. C. Schultz, J. F. Miller, E. Burke, and N. Jonoska (eds.): Proceedings of the Genetic and Evolutionary Computation Conference (GECCO). San Francisco, CA, pp. 1081-1088.
González, F., D. Dasgupta, and J. Gomez: 2003, 'The Effect of Binary matching Rules in Negative Selection'. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO).
González, F., D. Dasgupta, and R. Kozma: 2002, 'Combining negative selection and classification techniques for anomaly detection'. In: D. B. Fogel, M. A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, and M. Shackleton (eds.): Proceedings of the 2002 Congress on Evolutionary Computation CEC2002. USA, pp. 705-710.
Harmer, P., P. D. Williams, G. Gunsch, and G. Lamont: 2002, 'An Artificial Immune System Architecture for Computer Security Applications'. IEEE Transactions on Evolutionary Computation 6(3), 252-280.
Haykin, S.: 1994, Neural networks : a comprehensive foundation. New York: Macmillan.
Hofmeyr, S. and S. Forrest: 2000, 'Architecture for an Artificial Immune System'. Evolutionary Computation 8(4), 443-473.
Hsu, W., L. Auvil, W. Pottenger, D. Tcheng, and M. Welge: 1999, 'Self-organizing systems for knowledge discovery in databases'. In: Proceedings of the international joint conference on neural networks IJCNN-99. USA.
Keogh, E., S. Lonardi, and B. Chiu: 2002, 'Finding surprising patterns in a time series database in linear time and space'. In: O. R. Zaïane, R. Goebel, D. Hand, D. Keim, and R. Ng (eds.): Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02). USA, pp. 550-556.
Kephart, J. O.: 1994, 'A Biologically Inspired Immune System for Computers'. In: R. A. Brooks and P. Maes (eds.): Proceedings of the 4th International Workshop on the Synthesis and Simulation of Living Systems, Artificial Life IV. Cambridge, MA, USA, pp. 130-139.
Kim, J. and P. Bentley: 2001, 'An Evaluation of Negative Selection in an Artificial Immune System for Network Intrusion Detection'. In: L. Spector, E. D. Goodman, A. Wu, W. B. Langdon, H.-M. Voigt, M. Gen, S. Sen, M. Dorigo, S. Pezeshk, M. H. Garzon, and E. Burke (eds.): Proceedings of the Genetic and Evolutionary Computation Conference (GECCO). San Francisco, CA, pp. 1330-1337.
Kohonen, T.: 1995, Self-Organizing Maps, Vol. 30 of Springer Series in Information Sciences. Berlin, Heidelberg: Springer. (Second Extended Edition 1997).
Lane, T.: 2000, 'Machine learning techniques for the computer security'. Ph.D. thesis, Purdue University.
Lee, W. and S. Stolfo: 1998, 'Data mining approaches for intrusion detection'. In: Proceedings of the 7th USENIX security symposium. Berkeley, CA, pp. 79-94.
Mackey, M. and L. Glass: 1977, 'Oscillation and chaos in physiological control systems'. Science 197, 287-289.
MIT: 1999, '1999 Darpa intrusion detection evaluation'. MIT Lincoln Labs.
Murphy, P. and D. Aha: 1992, 'UCI Repository of machine learning databases'.
Portnoy, L., E. Eskin, and S. Stolfo: 2001, 'Intrusion detection with unlabeled data using clustering'. In: Proceedings of ACM CCS Workshop on Data Mining Applied to Security. USA.
Provost, F., T. Fawcett, and R. Kohavi: 1998, 'The case against accuracy estimation for comparing induction algorithms'. In: J. Shavlik (ed.): Proceedings of 15th International Conference on Machine Learning. San Francisco, CA, pp. 445-453.
Wolberg, W. H. and O. Mangasarian: 1990, 'Multisurface method of pattern separation for medical diagnosis applied to breast cytology'. Proceedings of the National Academy of Sciences, U.S.A. 87, 9193-9196.
Yoshikiyo, T.: 2001, 'Fault detection by mining association rules from housekeeping data'. In: Proceedings of international symposium on artificial intelligence, robotics and automation in space (i-sairas 2001). Montreal, Canada.
