Papers by Artyom Shaposhnikov
Algorithms for Efficient Data Compression in Databases using the Semantic Binary Model
World Multiconference on Systemics, Cybernetics and Informatics: Information Systems Development, Jul 22, 2001

The majority of database benchmarks currently in use in the industry were designed for relational databases. A different class of benchmarks became necessary once object-oriented databases appeared on the market. None of the existing benchmarks were designed to adequately exercise the distinctive features native to semantic databases. A new semantic benchmark is proposed which allows evaluation of the performance of the features characteristic of semantic database applications. The application used in the benchmark represents a class of problems requiring databases with sparse data, complex inheritance and many-to-many relations. Such problems can be naturally accommodated by semantic databases. A predefined implementation is not enforced, allowing a designer to choose the most efficient structures available in the DBMS tested. The second part of this paper compares the performance of the Sem-ODB binary semantic database with that of one of the leading relational databases. The results of the benchmark are analyzed.
Load balancing policy in a massively parallel semantic database
Proceedings of the First International Conference on Massively Parallel Computing Systems (MPCS) The Challenges of General-Purpose and Special-Purpose Computing
We are developing a massively parallel semantic database machine. Our basic semantic storage structure ensures balanced load for most parts of the database. For the other parts of the database, a load balancing algorithm is proposed herein, which allows inexpensive dynamic rebalancing without substantial negative impact on queries and transactions.

Algorithms for efficient transaction management and consistent queries in client-server semantic object-oriented parallel databases
Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in such applications as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with a probability that can be several orders of magnitude greater than under regular optimistic concurrency algorithms. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented. This dissertation also proposes a new query optimization technique (lazy queries). Lazy queries are an adaptive query execution scheme that optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries very efficiently, requiring neither full execution of large sub-queries nor any statistical knowledge about the data. An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection.
This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries. A formal proof of correctness of the proposed algorithm is also presented.
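For readers unfamiliar with the terms, the general idea of backward validation with a logical clock can be sketched as follows. This is a minimal textbook-style illustration, not the dissertation's algorithm: the names `Transaction`, `Validator`, `begin`, and `try_commit` are hypothetical, and the sketch has none of the scheduling technique, fine-grained virtual locks, or parallelism described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    start_ts: int                                  # logical clock value at begin
    read_set: set = field(default_factory=set)     # items read so far
    write_set: set = field(default_factory=set)    # items to be written at commit


class Validator:
    """Hypothetical validator: checks a committing transaction against
    every transaction that committed after it started."""

    def __init__(self):
        self.clock = 0          # logical clock, advanced on every commit
        self.committed = []     # list of (commit_ts, write_set)

    def begin(self):
        return Transaction(start_ts=self.clock)

    def try_commit(self, txn):
        # Conflict: an item we read was overwritten by a later commit.
        for commit_ts, write_set in self.committed:
            if commit_ts > txn.start_ts and write_set & txn.read_set:
                return False    # abort; the caller may retry
        self.clock += 1
        self.committed.append((self.clock, frozenset(txn.write_set)))
        return True
```

In this naive form, every item in the read set is a potential conflict with any later commit, which is consistent with the abstract's remark that abort rates in regular optimistic algorithms grow quickly with transaction size.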
High performance Lempel-Ziv compression using optimized longest string parsing and adaptive Huffman window size
Proceedings DCC 2000. Data Compression Conference
Summary form only given. We present optimizations that improve the compression and computational efficiency of Lempel-Ziv (LZ77) and Huffman algorithms. The compression performance of the LZ77 algorithm can be improved by using an optimized longest match parsing strategy. Another factor that can be considered is the size of the reference to the matching string, which can vary due to the secondary Huffman compression. We present an efficient parsing algorithm that considers both factors while minimizing the computational requirements. Our second optimization technique optimizes static Huffman encoding by efficiently dividing the input into blocks of varying size with uneven character frequency distributions.
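As background for the summary above, the baseline that such optimizations start from, a plain greedy longest-match LZ77 parse, can be sketched as below. This is an assumed illustration, not the paper's parser: the function names, the token format, and the `window` and `min_len` parameters are hypothetical, the match search is brute force, and the sketch ignores the Huffman-coded size of each reference, which the paper's algorithm takes into account.

```python
def lz77_greedy(data: bytes, window: int = 4096, min_len: int = 3):
    """Greedy parse into ('lit', byte) or ('ref', distance, length) tokens."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlapping copy).
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append(("ref", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_decode(tokens) -> bytes:
    """Inverse of lz77_greedy; copies byte by byte so overlaps work."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

A greedy parse always takes the longest match at the current position; a parser that also weighs the encoded cost of each reference may prefer a shorter or nearer match when that yields fewer output bits overall.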
Very Large Data Bases, 2000
The Semantic Binary Object-oriented Data Model (Sem-ODM) provides an expressive data model (similar to object-oriented data models) with a well-known declarative query facility, SQL (as in relational databases). Advantages of using Sem-ODM include (i) friendlier and more intelligent generic user interfaces; (ii) comprehensive enforcement of integrity constraints; (iii) greater flexibility; (iv) substantially shorter application programs; and (v) an easier query facility.
S-tree-a High Performance Multidimensional Spatial Index