Random Projection in supervised non-stationary environments
2020, The European Symposium on Artificial Neural Networks
Abstract
Random Projection (RP) is a popular and efficient technique for preprocessing high-dimensional data and reducing its dimensionality. While RP has been widely used and evaluated in stationary data analysis scenarios, its behavior in non-stationary environments has not been well analyzed. In this paper we provide a thorough evaluation of RP on streaming data. We discuss how the distortion introduced by RP can be bounded for streaming data using the Johnson-Lindenstrauss (JL) lemma. In particular, we analyze the effect of concept drift, a key challenge for streaming data. We also report experiments with RP on streaming data, using state-of-the-art streaming classifiers such as the Adaptive Hoeffding Tree, to evaluate its efficiency.
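As a rough illustration of the idea, the following minimal sketch projects a synthetic stream with one fixed Gaussian random matrix and computes a target dimension from a commonly quoted form of the JL bound, k >= 4 ln(n) / (eps^2/2 - eps^3/3). It is not the paper's experimental setup: the function names (jl_min_dim, make_projection, project_batch), the batch size, the dimensions, and the mean shift used to imitate concept drift are all assumptions chosen for illustration.

```python
import math
import numpy as np


def jl_min_dim(n_samples: int, eps: float) -> int:
    """Target dimension from a common form of the JL bound (illustrative helper)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    denom = eps ** 2 / 2 - eps ** 3 / 3
    return math.ceil(4 * math.log(n_samples) / denom)


def make_projection(d: int, k: int, rng: np.random.Generator) -> np.ndarray:
    """Dense Gaussian matrix, scaled so squared norms are preserved in expectation."""
    return rng.standard_normal((d, k)) / math.sqrt(k)


rng = np.random.default_rng(0)
d, k = 1000, 200                # assumed original and reduced dimensions
R = make_projection(d, k, rng)  # drawn once and reused for the whole stream


def project_batch(X: np.ndarray) -> np.ndarray:
    """Project one batch of stream samples (rows) into k dimensions."""
    return X @ R


# Synthetic stream: the mean shifts halfway through to imitate an abrupt drift.
ratios = []
for t in range(10):
    shift = 0.0 if t < 5 else 3.0
    X = rng.standard_normal((32, d)) + shift
    Z = project_batch(X)
    # Ratio of squared norms after and before projection, averaged per batch.
    ratios.append(float(np.mean(np.sum(Z ** 2, axis=1) / np.sum(X ** 2, axis=1))))
```

In this sketch the projection matrix does not depend on the data, so the same matrix is applied before and after the simulated drift. The per-batch ratios stored in ratios can be inspected to see how well squared norms are preserved across the stream.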
ESANN 2020 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Online event, 2-4 October 2020, i6doc.com publ., ISBN 978-2-87587-074-2. Available from http://www.i6doc.com/en/.