Panda

Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems

https://doi.org/10.1145/3524860.3543281

Abstract

Distributed Stream Processing (DSP) systems rely heavily on parallelism mechanisms to deliver high performance in terms of latency and throughput. Yet developing such parallel systems comes with numerous challenges. In this paper, we focus on how to select appropriate resources for parallel stream processing in the presence of highly dynamic and unseen workloads. We present PANDA, which provides a novel learned approach for highly efficient and parallel DSP systems. The main idea is to use zero-shot cost models to provide accurate resource estimates, and hence an optimal degree of parallelism, to meet performance demands.

CCS CONCEPTS • Computer systems organization → Real-time systems.
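To make the idea of deriving a parallelism degree from cost estimates concrete, the following is a minimal sketch and not PANDA's actual method. It assumes a cost model can be queried per candidate degree for predicted latency and throughput, and it picks the smallest degree that meets both targets. The names `toy_cost_model` and `select_parallelism`, and the analytic formula inside the toy model, are hypothetical stand-ins for a learned zero-shot cost model.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical interface: parallelism degree -> (latency in ms, throughput in events/s).
CostModel = Callable[[int], Tuple[float, float]]


def toy_cost_model(degree: int) -> Tuple[float, float]:
    """Illustrative analytic stand-in for a learned cost model (assumption)."""
    # Work is split across instances, but each instance adds coordination overhead.
    latency_ms = 400.0 / degree + 5.0 * degree
    # Assumes throughput scales linearly with the degree of parallelism.
    throughput = 1000.0 * degree
    return latency_ms, throughput


def select_parallelism(
    model: CostModel,
    candidates: Iterable[int],
    max_latency_ms: float,
    min_throughput: float,
) -> Optional[int]:
    """Return the smallest candidate degree whose predictions meet both targets."""
    for degree in sorted(candidates):
        latency_ms, throughput = model(degree)
        if latency_ms <= max_latency_ms and throughput >= min_throughput:
            return degree
    return None  # no candidate satisfies the performance demands


if __name__ == "__main__":
    chosen = select_parallelism(toy_cost_model, range(1, 9), 150.0, 2500.0)
    print(chosen)
```

Choosing the smallest satisfying degree reflects one possible reading of "appropriate resources": meet the latency and throughput demands without over-provisioning. A real system would replace the toy model with predictions from a trained cost model.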
