The speedup that can be achieved with parallel and distributed architectures is limited by at least two laws: Amdahl's law and Gustafson's law. The former bounds the speedup to a constant value when a fixed-size problem is executed on a multiprocessor, while the latter bounds the speedup of fixed-time problems to linear growth, i.e., to the number of processors used. However, a superlinear speedup (a speedup greater than the number of processors used) can be achieved due to insufficient memory on a single machine, while parallel and, especially, distributed systems can even slow down the execution compared to the sequential one due to communication overhead. Since cloud performance is uncertain and can be influenced by the available memory and networks, in this paper we investigate whether it follows the same speedup pattern as other traditional distributed systems. The focus is to determine how elastic cloud services behave in different scaled environments…
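As a quick illustration of the two bounds this abstract contrasts, the following Python sketch evaluates both laws in their standard textbook forms (p is the parallelizable fraction of the work, n the number of processors); the formulas are the classical ones, not taken from the paper itself.

# Minimal sketch of the two classical speedup bounds (not the paper's code).

def amdahl(p: float, n: int) -> float:
    """Fixed-size problem: speedup saturates at 1/(1-p) as n grows."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson(p: float, n: int) -> float:
    """Fixed-time (scaled) problem: speedup grows linearly with n."""
    return (1.0 - p) + p * n

# With p = 0.95, Amdahl's speedup stalls near 20 while Gustafson's keeps growing.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl(0.95, n), 2), round(gustafson(0.95, n), 2))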
Nearly every part of life now results in the generation of data. Logs are documentation of events or records of system activities and are created automatically by IT systems. Log data analysis is the process of making sense of these records. Log data often grows quickly, and conventional database solutions fall short when dealing with a large volume of log files. Hadoop, with its wide range of applications in Big Data analysis, provides a solution to this problem. In this study, Hadoop was installed on two virtual machines. Log files generated by a Python script were analyzed in order to evaluate system activities. The aim was to validate the importance of Hadoop in meeting the challenge of dealing with Big Data. The performed experiments show that analyzing logs with Hadoop MapReduce makes data processing and the detection of malfunctions and defects faster and simpler.
Keywords: Hadoop, MapReduce, Big Data, log analysis, distributed file systems.
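As a hedged illustration of the kind of MapReduce job described here, the following pair of Python scripts sketches a Hadoop Streaming mapper and reducer that count log lines per severity level. The log format (severity as the third whitespace-separated field) is an assumption for illustration, not the paper's actual script.

#!/usr/bin/env python3
# mapper.py -- emits one (severity, 1) pair per log line; the field layout
# is an assumed example format, not the format used in the study.
import sys

for line in sys.stdin:
    fields = line.split()
    if len(fields) >= 3:
        level = fields[2]          # e.g. INFO, WARN, ERROR
        print(f"{level}\t1")

#!/usr/bin/env python3
# reducer.py -- sums counts per severity level. Hadoop Streaming delivers
# the mapper output sorted by key, so a single running group suffices.
import sys

current, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current:
        if current is not None:
            print(f"{current}\t{count}")
        current, count = key, 0
    count += int(value)
if current is not None:
    print(f"{current}\t{count}")

A typical invocation would pass both scripts to the Hadoop Streaming jar, along the lines of: hadoop jar hadoop-streaming.jar -input logs/ -output out/ -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py (the jar path varies by installation).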
Query Processing and Access Methods for Big Astro and Geo Databases
Despite having developed in different communities, astro-informatics and geo-informatics share the same characteristics in the management and analytics of astronomical and geospatial data, and they raise the same challenges when it comes to accessing, querying, or analyzing spatial features over Big Data. The very first challenge is the data volume, which is tremendous in many geo and astro datasets. In this chapter, we highlight their main specificities and outline the main steps of query processing in big geospatial and astronomical data servers. Through a review of the state of the art, we show the advances in Big Data management in both the geospatial and the sky-surveying contexts, while highlighting their similarity. This progress notwithstanding, several issues remain in dealing with the variety of the data (such as multidimensional arrays).
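To make the flavor of such spatial queries concrete, here is a minimal brute-force cone search in Python using astropy (an assumed dependency chosen for this sketch; the servers the chapter surveys use spatial indexes such as HEALPix rather than linear scans).

# Toy cone search: find catalog objects within 0.5 degrees of a center point.
from astropy.coordinates import SkyCoord
import astropy.units as u

catalog = SkyCoord(ra=[10.1, 10.4, 200.0] * u.deg,
                   dec=[41.2, 41.3, -30.0] * u.deg)   # made-up positions
center = SkyCoord(ra=10.2 * u.deg, dec=41.25 * u.deg)

hits = center.separation(catalog) < 0.5 * u.deg       # angular distance filter
print(catalog[hits])                                   # the two nearby objects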
High-performance routers are fundamental building blocks of the system-wide interconnection networks of high-performance computing systems. Through their collective interaction, they provide reliable communication between the computing nodes and manage the communication dataflow. The development of a specialized router architecture is highly complex and requires many factors to be considered. The architecture of high-performance routers depends strongly on the flow control mechanism, as it dictates the way packets are transferred through the network. In this paper, a novel high-performance "Step-Back-On-Blocking" router architecture is proposed.
Extreme-scale parallel computing systems will have tens of thousands of optionally accelerator-equipped nodes with hundreds of cores each, as well as deep memory hierarchies and complex interconnect topologies. Such exascale systems will provide hardware parallelism at multiple levels and will be energy constrained. Their extreme scale and the rapidly deteriorating reliability of their hardware components mean that exascale systems will exhibit low mean-time-between-failure values. Furthermore, existing programming models already require heroic programming and optimization efforts to achieve high efficiency on current supercomputers, and these efforts are invariably platform-specific and non-portable. In this article, we explore the shortcomings of existing programming models and runtimes for large-scale computing systems. We propose and discuss important features of programming paradigms and runtimes for exascale computing systems, with a special focus on data-intensive applications and resilience. Finally, we discuss code sustainability issues and propose several software metrics that are of paramount importance for code development for ultrascale computing systems.
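One resilience technique relevant to the low mean-time-between-failure regime discussed above is application-level checkpoint/restart. The following Python sketch shows the general pattern under assumed names (the state.ckpt path and the 100-iteration checkpoint interval are arbitrary choices for illustration); it is not a mechanism proposed in the article.

# Minimal checkpoint/restart sketch: bounded work is lost on a crash.
import os
import pickle

CHECKPOINT = "state.ckpt"  # hypothetical path for this example

def load_state():
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "acc": 0.0}

def save_state(state):
    """Write to a temp file, then atomically rename, surviving a mid-write crash."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 1_000):
    state["acc"] += step * 0.5     # stand-in for real computation
    state["step"] = step + 1
    if step % 100 == 0:
        save_state(state)          # at most 100 iterations are lost on failure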
Development of highly parallel “Beowulf” cluster and optimization of MPI based implementation of the N-Queens problem with regards to the architecture of the system
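A minimal sketch of one common way to parallelize N-Queens with MPI, statically partitioning the first-row columns across ranks via mpi4py; this illustrates the approach named in the title under stated assumptions and is not the paper's optimized implementation.

# Assumed illustration: each MPI rank counts solutions for its share of
# first-row queen placements, and rank 0 sums the partial counts.
from mpi4py import MPI

def solutions_below(n, cols):
    """Count completions of a partial placement; cols[i] is the queen's column in row i."""
    row = len(cols)
    if row == n:
        return 1
    total = 0
    for c in range(n):
        if all(c != pc and abs(c - pc) != row - pr for pr, pc in enumerate(cols)):
            total += solutions_below(n, cols + [c])
    return total

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
N = 12  # board size, chosen arbitrarily for the example

# Round-robin assignment of first-row columns to ranks.
local = sum(solutions_below(N, [c]) for c in range(rank, N, size))
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(f"{N}-Queens solutions: {total}")

Run with, e.g., mpirun -np 4 python nqueens.py; a real implementation tuned to the cluster architecture would add dynamic load balancing, since subtrees differ widely in size.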
Step-Back-On-Blocking Flow Control Mechanism for High Performance Interconnection Networks
The flow control mechanism is a crucial element for achieving low communication latency and high utilization of the interconnection network resources. A well-designed flow control mechanism should balance the communication load under both uniform and non-uniform traffic patterns, and it significantly determines the communication performance of the interconnection network. In this paper we propose an efficient Step-Back-on-Blocking flow control intended for high-performance interconnection networks, together with an adequate switch architecture designed to implement it. The efficiency and behavior of the flow control have been verified through various simulation experiments conducted in the discrete-event OMNeT++ simulation environment.
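The abstract does not spell out the mechanism, so the following Python toy encodes one plausible reading of "step-back-on-blocking": a head packet that finds the next buffer full retreats one hop instead of holding its slot. The linear five-router topology, buffer capacity, and packet names are all assumptions; the paper's actual OMNeT++ model is not reproduced here.

# Toy model: bounded FIFO buffers along a linear chain of routers.
from collections import deque

CAPACITY = 2
path = [deque() for _ in range(5)]

def step(path):
    """One time step: heads advance; on a full next buffer they step back one hop."""
    moved = set()
    for i in range(len(path) - 2, -1, -1):      # downstream routers move first
        if not path[i] or id(path[i][0]) in moved:
            continue
        pkt = path[i][0]
        if len(path[i + 1]) < CAPACITY:
            path[i].popleft()
            path[i + 1].append(pkt)             # normal forward hop
            moved.add(id(pkt))
        elif i > 0 and len(path[i - 1]) < CAPACITY:
            path[i].popleft()
            path[i - 1].append(pkt)             # step back, freeing the slot
            moved.add(id(pkt))

# Seed downstream congestion so the step-back branch actually fires.
path[4].extend(["x", "y"])   # sink is full
path[3].extend(["c", "d"])
path[2].append("a")
step(path)
print([list(q) for q in path])  # 'c' has stepped back to router 2; 'a' advanced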
Effective resource utilization of modern high-performance computing (HPC) systems is the subject of many scientific investigations. In this paper, new-generation programming languages and models for high-performance computing systems are systematized and presented. The motivations behind modern methods for multithreading and vector parallelization are analyzed. Attention is focused on the four main attributes of modern high-performance computing systems: performance, programmability, portability, and robustness, because they are fundamental to the development of a new generation of programming languages.