PANDA
Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems
https://doi.org/10.1145/3524860.3543281
2 pages
Abstract
Distributed Stream Processing (DSP) systems rely heavily on parallelism mechanisms to deliver high performance in terms of latency and throughput. Yet the development of such parallel systems comes with numerous challenges. In this paper, we focus on how to select appropriate resources for parallel stream processing in the presence of highly dynamic and previously unseen workloads. We present PANDA, a novel learned approach for highly efficient and parallel DSP systems. The main idea is to provide accurate resource estimates, and hence an optimal parallelism degree, using zero-shot cost models to meet the performance demands.
CCS CONCEPTS • Computer systems organization → Real-time systems.
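The idea of using a learned cost model to pick a parallelism degree can be sketched as follows. This is illustrative only, not PANDA's actual model or API: the linear latency model and the function names (`predict_latency_ms`, `min_parallelism`) are stand-in assumptions.

```python
# Hypothetical sketch: a learned cost model would replace the stand-in
# predictor below; here latency simply grows with per-instance load.

def predict_latency_ms(input_rate, parallelism):
    """Stand-in cost model: base cost plus load-dependent cost."""
    per_instance_rate = input_rate / parallelism
    return 5.0 + 0.02 * per_instance_rate

def min_parallelism(input_rate, latency_target_ms, max_parallelism=64):
    """Return the smallest degree whose predicted latency meets the target."""
    for p in range(1, max_parallelism + 1):
        if predict_latency_ms(input_rate, p) <= latency_target_ms:
            return p
    return None  # target unreachable within the resource budget

print(min_parallelism(10_000, 30.0))  # smallest degree meeting a 30 ms target
```

A zero-shot model would play the role of `predict_latency_ms`, generalizing to workloads it has never observed instead of being fitted per application.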
Related papers
Indian Journal of Computer Science and Engineering
The present study reports the theoretical aspects of a big data stream processing architecture and its performance metric identification, performance optimization, and related experiments. The proposed architecture is modeled as a complex directed graph with various real-time computational elements as nodes and big data tuples as edges, forming a real-time topology. The notions of hard-deadline-bound computation on streaming big data tuples, along with a minimum performance guarantee for processing every tuple, are introduced in the present research. Time-bound computation issues in the real-time stream computing architecture are improved by optimizing time deadline management through task forking models. An algorithm for throughput optimization is reported, and the performance metrics of the proposed system are identified for proper analysis on the basis of queueing theory, expressed through appropriate Kendall's notation. Experimental results, at the end, report considerable performance improvements of the architecture when the optimization algorithms are applied to a standard dataset.
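The kind of queueing-theoretic sizing described above can be illustrated with a minimal M/M/c check (Kendall notation: Markovian arrivals, Markovian service, c servers). The arrival and service rates below are made-up numbers, not figures from the paper.

```python
# Illustrative M/M/c utilization check; rates are hypothetical.

def mmc_utilization(arrival_rate, service_rate, servers):
    """Per-server utilization rho = lambda / (c * mu) for an M/M/c queue."""
    return arrival_rate / (servers * service_rate)

def min_servers_for_stability(arrival_rate, service_rate, rho_max=0.8):
    """Smallest number of parallel workers keeping utilization below rho_max."""
    c = 1
    while mmc_utilization(arrival_rate, service_rate, c) > rho_max:
        c += 1
    return c

# 900 tuples/s arriving, each worker processes 100 tuples/s:
print(min_servers_for_stability(900.0, 100.0))
```

Keeping utilization below 1 is the stability condition; the stricter `rho_max` bound leaves headroom so tuple deadlines can still be met under bursts.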
Stream processing systems are becoming increasingly important for analysing real-time data generated by modern applications such as online social networks. Their main characteristic is to produce a continuous stream of fresh results as new data are generated in real time. Resource provisioning of stream processing systems is difficult due to time-varying workloads that induce unknown resource demands over time.
2014 International Conference on High Performance Computing & Simulation (HPCS), 2014
Data Stream Processing (DaSP) is a recent and highly active research field, applied in various real-world scenarios. Unlike traditional applications, input data are seen as transient continuous streams that must be processed "on the fly", with critical requirements on throughput, latency, and memory occupancy. A parallel solution is often advocated, but designing and implementing high-throughput, low-latency DaSP applications is complex in itself, and further complicated by the presence of multiple streams characterized by high volume, high velocity, and high variability. Moreover, parallel DaSP applications must be able to adapt to data dynamics in order to satisfy desired QoS levels. The aim of our work is to study these problems in an integrated way, providing programmers with a methodological framework for the parallelization of DaSP applications.
Proceedings of the 22nd International Middleware Conference: Doctoral Symposium, 2021
Resource management in Distributed Stream Processing Systems (DSPS) defines the way queries are deployed on in-network resources to deliver query results while fulfilling the Quality of Service (QoS) requirements of end-users. Various resource management mechanisms have been proposed for DSPS; however, they become inefficient under the challenging conditions imposed by dynamic environments and heterogeneous resources. This is because they focus on pre-configuration of single, static QoS requirements. In addition, they lack cooperation between heterogeneous resources, which amplifies the problem of coordination between resources. This can lead to severe performance degradation, such as inconsistent and incorrect query results, in comparison to homogeneous resources. To address these challenges, in this research work we will propose mechanisms: (i) to forecast the performance of the network and heterogeneous resources, (ii) to select an efficient resource management approach, and (iii) for cooperation between resources in a dynamic environment.
CCS CONCEPTS • Information systems → Stream management.
2007 IEEE International Parallel and Distributed Processing Symposium, 2007
In today's world, stream processing systems have become important, as applications like media broadcasting, sensor network monitoring and on-line data analysis increasingly rely on real-time stream processing. In this paper, we propose a distributed stream processing system that composes stream processing applications dynamically, while meeting their rate demands. Our system consists of the following components: (1) a distributed component discovery algorithm that discovers components available at nodes on demand, (2) resource monitoring techniques to maintain current resource availability information, (3) a scheduling algorithm that schedules application execution, and (4) a minimum cost composition algorithm that composes applications dynamically based on component and resource availability and scheduling demands. Our detailed experimental results, over the PlanetLab testbed, demonstrate the performance and efficiency of our approach.
IEEE Access, 2021
More and more use cases require fast, accurate, and reliable processing of large volumes of data. This calls for a distributed stream processing framework that can distribute the load over several machines. In this work, we study and benchmark the scalability of stream processing jobs in four popular frameworks: Flink, Kafka Streams, Spark Streaming, and Structured Streaming. Besides that, we determine the factors that influence the performance and efficiency of scaling processing jobs with distinct characteristics. We evaluate horizontal as well as vertical scalability. Our results show how scaling efficiency is impacted by many factors, including the initial cluster layout and direction of scaling, the pipeline design, the framework design, resource allocation, and data characteristics. Finally, we give some recommendations on how practitioners should scale their clusters.
IEEE Transactions on Parallel and Distributed Systems, 2021
Nowadays, we are witnessing the diffusion of Stream Processing Systems (SPSs) able to analyze data streams in near real-time. Traditional SPSs like STORM and FLINK target distributed clusters and adopt the continuous streaming model, where inputs are processed as soon as they are available while outputs are continuously emitted. Recently, there has been great interest in SPSs for scale-up machines. Some of them (e.g., BRISKSTREAM) still use the continuous model to achieve low latency. Others optimize throughput with batching approaches that are, however, often inadequate to minimize latency for live-streaming applications. Our contribution is a novel software engineering approach to designing the runtime system of SPSs targeting multicores, with the aim of providing a uniform solution able to optimize both throughput and latency. The approach has a formal nature based on the assembly of components called building blocks, whose composition allows optimizations to be expressed in a compositional manner. We use this methodology to build a new SPS called WINDFLOW. Our evaluation showcases the benefits of WINDFLOW: it provides lower latency than SPSs for continuous streaming, and it can be configured to optimize throughput, performing similarly to and even better than batch-based scale-up SPSs.
2011
Stream processing systems must handle stream data coming from real-time, high-throughput applications, for example in financial trading. Timely processing of streams is important and requires sufficient available resources to achieve high throughput and deliver accurate results. However, static allocation of stream processing resources in terms of machines is inefficient when input streams have significant rate variations: machines remain under-utilised for long periods of average load.
Journal of Parallel and Distributed Computing
Distributed Stream Processing frameworks are commonly used with the evolution of the Internet of Things (IoT). These frameworks are designed to adapt to dynamic input message rates by scaling in/out. Apache Storm, originally developed by Twitter, is a widely used stream processing engine; others include Flink [8] and Spark Streaming [73]. To run streaming applications successfully, the optimal resource requirement must be known, as over-estimation of resources adds extra cost. We therefore need a strategy to determine the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the required resource allocation. Further, this intuition also drives resource mapping and helps narrow the gap between estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources.
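The model-driven estimation idea can be sketched minimally: use per-operator performance models (here reduced to assumed, measured peak rates per instance) to size a linear pipeline for a target input rate. The operator names, rates, and selectivities below are hypothetical, not from the article.

```python
# Minimal model-driven sizing sketch; all numbers are made up.
import math

peak_rate_per_instance = {"parse": 5000, "join": 1500, "aggregate": 2500}  # tuples/s
selectivity = {"parse": 1.0, "join": 0.5, "aggregate": 1.0}  # output/input ratio

def allocate(target_input_rate):
    """Estimate instances per operator for a linear pipeline."""
    plan, rate = {}, target_input_rate
    for op in ["parse", "join", "aggregate"]:
        plan[op] = math.ceil(rate / peak_rate_per_instance[op])
        rate *= selectivity[op]  # downstream operators see the output rate
    return plan

print(allocate(10_000))
```

Because the estimate comes from a model rather than trial runs, the scheduler can reason about allocation before deployment, which is what makes the resulting behavior predictable.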
IEEE Access, 2018
Parallelizing and optimizing code for recent multi-/many-core processors has been recognized as a complex task. For this reason, strategies to automatically transform sequential code into parallel code and discover optimization opportunities are crucial to relieve the burden on developers. In this paper, we present a compile-time framework to (semi-)automatically find parallel patterns (Pipeline and Farm) and transform sequential streaming applications into parallel ones using GrPPI, a generic parallel pattern interface. This framework uses a novel pipeline stage-balancing technique which provides the code generator module with the necessary information to produce balanced pipelines. The evaluation, using a synthetic video benchmark and a real-world computer vision application, demonstrates that the presented framework is capable of producing parallel and optimized versions of the application. A comparison study under several thread-core oversubscribed conditions reveals that the framework achieves performance comparable to the Intel TBB programming framework.
INDEX TERMS Refactoring framework, automatic parallelization, load-balanced pipeline, parallel patterns.

References (7)
- S. Frischbier, M. Paic, A. Echler, and C. Roth, "Managing the complexity of processing financial data at scale - an experience report," in CSDM, 2019, pp. 14–26.
- X. Jiang, "Blink: How Alibaba Uses Apache Flink," https://www.ververica.com/blog/blink-flink-alibaba-search, 2021, [Online; accessed 27-05-2022].
- X. Liu and R. Buyya, "Resource management and scheduling in distributed stream processing systems: a taxonomy, review, and future directions," in CSUR, vol. 53, pp. 1–41, 2020.
- R. Mayer, B. Koldehofe, and K. Rothermel, "Predictable low-latency event detection with parallel complex event processing," in IEEE IoTJ, vol. 2, pp. 274–286, 2015.
- V. Cardellini, F. L. Presti, M. Nardelli, and G. R. Russo, "Decentralized self-adaptation for elastic data stream processing," in FGCS, vol. 87, pp. 171–185, 2018.
- R. Heinrich, M. Luthra, H. Kornmayer, and C. Binnig, "Zero-shot cost models for distributed stream processing," in DEBS, 2022, accepted for publication.
- B. Hilprecht and C. Binnig, "Zero-shot cost models for out-of-the-box learned cost prediction," arXiv preprint, 2022.