A Scalable Platform for Monitoring Data Intensive Applications

2019, Journal of Grid Computing

https://doi.org/10.1007/S10723-019-09483-1

Abstract

Recent advances in information technology and widespread growth in different areas are producing large amounts of data. Consequently, a large number of distributed platforms for storing and processing large datasets have been proposed over the past decade. Whether in development or in production, monitoring the applications running on these platforms is not an easy task. Dedicated tools and platforms have been proposed for this task, yet none is specifically designed for Big Data frameworks. In this paper we present a distributed, scalable, highly available platform able to collect, store, query and process monitoring data obtained from multiple Big Data frameworks. Alongside the architecture, we show experimentally that the proposed solution is scalable and can handle a substantial quantity of monitoring data.
