Academia.edu

Distributed data structures

23 papers
8 followers
About this topic
Distributed data structures are data organization methods that enable the storage, management, and processing of data across multiple networked computers. They facilitate efficient data access and manipulation in distributed computing environments, ensuring consistency, fault tolerance, and scalability in handling large datasets.

Key research themes

1. How can distributed data structures achieve scalability and high availability in large-scale, failure-prone environments?

This research theme investigates the design principles and system architectures that enable distributed data structures to support very large datasets across numerous commodity nodes while maintaining strong availability and fault tolerance. It focuses on mechanisms to handle continuous component failures, replication across geographically distributed sites, and models balancing scalability with consistency guarantees to address real-world operational challenges in distributed storage systems.

Key finding: Cassandra's design enables a decentralized, highly available, and scalable storage system supporting very high write throughput (billions of writes per day) and efficient replication across geo-distributed data centers. It...
Key finding: Multipol addresses the implementation and performance trade-offs of distributed data structures crucial for irregular applications by combining replication, partitioning, and dynamic caching to optimize both load balance and...
Key finding: This work expands on applying replication and partitioning techniques in distributed data structures for irregular applications with complex communication and control patterns. By introducing a relaxed consistency model akin...
Key finding: The survey formalizes scalable distributed data structures (SDDS) as systems that dynamically redistribute growing datasets across multiple servers to support scalability in data size and access rates without significant...
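
The partitioning and replication ideas in this theme can be illustrated with a small consistent-hash ring, a technique commonly used by decentralized stores to spread keys over nodes and to limit data movement when nodes join or fail. The Python sketch below is an illustration under stated assumptions, not code from Cassandra, Multipol, or any paper listed here; the class `HashRing`, its methods, and the parameters `vnodes` and `replicas` are hypothetical names chosen for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (stable across runs and machines)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with replica placement (hypothetical API)."""

    def __init__(self, nodes, vnodes=64, replicas=3):
        self.replicas = replicas
        self._nodes = set()
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=64):
        # Each node is placed at several positions to even out the load.
        self._nodes.add(node)
        for i in range(vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        # Simulates a failure or departure: drop all positions of the node.
        self._nodes.discard(node)
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def owners(self, key):
        """Return the distinct nodes that should hold replicas of `key`."""
        if not self._ring:
            return []
        want = min(self.replicas, len(self._nodes))
        i = bisect.bisect_right(self._ring, (_hash(key), ""))
        found = []
        while len(found) < want:
            # Walk clockwise, wrapping around, and skip repeated nodes.
            node = self._ring[i % len(self._ring)][1]
            if node not in found:
                found.append(node)
            i += 1
        return found
```

As a usage example, `HashRing(["n1", "n2", "n3", "n4"]).owners("user:42")` returns three distinct node names. If the first owner is then removed with `remove_node`, the remaining two owners keep their replicas and only one new node has to receive a copy, which is the property that makes this layout attractive in failure-prone environments.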

2. What programming models and abstractions facilitate the development and efficient use of distributed data structures for irregular and dynamic workloads?

This theme examines programming interface designs, library abstractions, and runtime support that enable developers to efficiently implement distributed data structures with irregular data access patterns, dynamic task graphs, and unpredictable communication costs. It explores how API design, partitioning strategies, and consistency models interplay to achieve both programming productivity and run-time performance in diverse distributed memory systems.

Key finding: DASH presents a compiler-free C++ template library implementing the partitioned global address space (PGAS) model, offering global-view data structures with owner-computes semantics and support for multidimensional...
Key finding: This work introduces a communication-efficient algorithm for constructing decision trees on vertically partitioned (heterogeneously distributed) data, minimizing data movement while approximating centralized accuracy. By...
Key finding: The paper analyzes a randomized algorithm for load balancing irregular, tree-structured computations on distributed multiprocessors, showing with high probability that total execution time is within a small constant factor of...
Key finding: Beyond fault tolerance and scalability, the Multipol library provides a common runtime and a suite of parallel data structure abstractions that hide communication and synchronization complexities from the programmer. Its...

3. How can distributed data structures optimize query processing and data management efficiency for large-scale and heterogeneous distributed data?

This research area focuses on algorithmic innovations and system architectures that enable efficient query execution, data storage, and resource management over large-scale distributed datasets. It emphasizes minimizing communication costs, optimizing data partitioning and replication, and adapting to heterogeneous environments to achieve scalable and performant data operations such as range queries, joins, and visualization pipelines.

Key finding: ART introduces a scalable exponential-tree overlay structure providing sub-logarithmic communication costs (O(log^2_b log N)) for range queries, point lookups, and join/leave operations with high probability in large...
Key finding: This work presents a middleware system leveraging an Enhanced Time-Space Partitioning (ETSP) tree and Logistical Networking depots to manage and visualize terabyte-scale, time-varying scientific datasets distributed across...
Key finding: The paper analyzes communication complexity bounds for distributed set-join variants, including set-intersection, equality, and threshold joins. It derives optimal randomized communication protocols for set-intersection and...
Key finding: This work proposes a scalable, virtualized distributed database architecture integrating data clustering, horizontal partitioning, and Single Query Multiple Database (SQMD) middleware mechanisms to provide transparent,...
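
To show why partitioning choices matter for range queries, the following Python sketch routes a range query over range-partitioned data so that only the partitions overlapping the query interval are contacted. It is a deliberately simple stand-in and does not implement ART, the ETSP tree, or SQMD; the class `RangePartitionedStore`, its methods, and the split points used below are assumptions made for illustration, and each "partition" is just a local dictionary standing in for a remote node.

```python
import bisect


class RangePartitionedStore:
    """Toy range-partitioned key-value store (hypothetical API)."""

    def __init__(self, split_points):
        # With sorted split points s0 < s1 < ..., partition 0 holds keys
        # below s0, partition i holds keys in [s(i-1), s(i)), and the last
        # partition holds keys >= the final split point.
        self.splits = sorted(split_points)
        self.partitions = [dict() for _ in range(len(self.splits) + 1)]

    def _pid(self, key):
        # Binary search for the partition responsible for `key`.
        return bisect.bisect_right(self.splits, key)

    def put(self, key, value):
        self.partitions[self._pid(key)][key] = value

    def range_query(self, lo, hi):
        """Return (sorted items with lo <= key <= hi, partitions contacted)."""
        first, last = self._pid(lo), self._pid(hi)
        contacted = list(range(first, last + 1))
        items = []
        for pid in contacted:
            # In a real system this loop would be one message per node.
            items.extend(
                (k, v) for k, v in self.partitions[pid].items() if lo <= k <= hi
            )
        return sorted(items), contacted
```

With split points `[100, 200, 300]`, a query for keys between 120 and 260 touches only partitions 1 and 2 out of four. Hash partitioning, by contrast, would scatter adjacent keys and force the query to contact every node, which is why range-oriented overlays preserve key order.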

All papers in Distributed data structures

As a result of advancing technological and scientific innovations, the orderly and systematic recording of commercial activities in line with accounting principles, as in all other fields, for a sustainable economic structure...
We focus on range query processing on large-scale, typically distributed infrastructures, such as clouds of thousands of nodes of shared-datacenters, of p2p distributed overlays, etc. In such distributed environments, efficient range...
Abstract: The spread of the Internet into every area of our lives has brought many conveniences and benefits to human life. On the other hand, the Internet has by its nature also brought problems and threats. For individuals, personal data being taken without authorization...
Blockchain technology, which has been followed with close interest by both the public and private sectors since it first emerged in 2008; cryptocurrencies, the recording and management of citizenship information, electronic voting,...
Parallelism plays a significant role in high-performance computing systems, from large clusters of computers to chip-multithreading (CMT) processors. Performance of the parallel systems comes not only from concurrently running more...
Cryptocurrencies rely on computers called miner nodes, connected to one another in a peer-to-peer architecture, and on a record-keeping system maintained in a blockchain structure. These systems, merely a currency...
This article examines the basic principles of blockchain technology, one of the rapidly developing technologies of recent years, and how this technology can be used in a supply chain. The aim of the study is Turkey's...
Question 6: How will blockchain technology affect the tourism sector? Manager-1: I think it depends on the guest portfolio. Manager-2: I have no knowledge of the subject. Manager-3: It will be adopted quickly, like credit cards....
This book, for carrying out real-world activities such as education, healthcare, real estate, transportation, banking, business, supply chain management, e-commerce, and decentralized finance, the versatile nature of blockchain technology and its tamper-resistant...
The aim of this study is to investigate the impact of blockchain technology on public administration. To this end, the current and potential effects of blockchain technology on public services are set out. The study, in brief, D5...
The huge amount of data available from Internet information sources has focused much attention on the sharing of distributed information through Peer Data Management Systems (PDMSs). In a PDMS, peers have a schema on their local data,...
Blockchain allows buyers and sellers to transact securely and directly with each other without the need for any validating third party, with all transactions in encrypted form on blocks...
Nowadays a large amount of information is available worldwide and is stored in databases and applications. These databases may be centralized or distributed depending on the needs of the application, but the primary concern...
The concepts of security and privacy have grown in importance in today's world with advancing technology. While the concept of security was once limited to physical threats and the measures that could be taken against them, today, especially...
With the age of Industry 4.0, many fields have accelerated the digitization process. Examples include Digital Supply Chain, Logistics 4.0, and Marketing 4.0. In 2008, the foundations of a new digital transfer system...
Healthcare is one of the most important fields in which information technology (IT) is used. This technology, used for years in hospital information systems and medical devices, with the widespread use of the Internet of Things (IoT)...
Nowadays, effective business performance depends on digital competitive factors and its ability to transform corporate capabilities in the light of digitalization. In this study, the evolving role of automated systems for global business...
The technological changes in recent years have played a major role in changing the rules in business life. After technological innovations such as autonomous tools, the Internet of Things, and industry 4.0, the new issue that has emerged...
Abstract: Systems are recommended to monitor some of the symptoms of COVID-19. Since the use of mobile phones has become very common among people, one of the most convenient solutions may be to use mobile phones for this process....
Blockchain technology, which attracted attention following the success of cryptocurrencies, is a developing and popular field of study. Its decentralized structure, one-way and indelible data recording, encrypted block...
Abstract: Blockchain technology, which succeeded in attracting attention after the success of cryptocurrency, is a developing and popular subject of study. With its decentralized structure, decentralized and indelible data recording,...
As technology advances and the link between people and machines grows, system and data security become more important. Attackers try to find gaps by examining systems and sometimes succeed. Successful attacks lead to material and moral...
Currently, there are many effective techniques that are used for filtering spam emails. However, spammers have mostly identified the weakness of those methods in order to bypass current detection systems. In...
Chair of Advisory Committee: Dr. Dmitri Loguinov. Finding near-duplicate documents is an interesting problem, but the existing methods are not suitable for large-scale datasets and memory-constrained systems. In this work, we developed...