Elastic Infrastructure for Interactive Data Farming Experiments
2012, Procedia Computer Science
https://doi.org/10.1016/J.PROCS.2012.04.022
Abstract
With the increasing availability of high-performance computing power, new approaches to simulation and analysis are becoming feasible. One such methodology is data farming, in which large amounts of data are generated by simulating many configurations drawn from a large parameter space and are then analyzed for patterns. Unfortunately, versatile data farming systems are scarce, and none of the existing solutions allows integration with novel Cloud solutions. In this paper, we present our system, a flexible solution for running very large data farming experiments on both intranet clusters and remote computational resources, including public Clouds. Another important aspect of the presented system is its support for interactive data farming experiments, with online analysis of partial experiment results and the capability to extend an experiment. A sample application of our system is presented on a military mission planning support scenario.