Tertiary Storage Organization for Large Multidimensional Datasets
Large multidimensional datasets are found in diverse application areas, such as data warehousing [6], satellite data processing, and high-energy physics [9]. According to current estimates, these datasets are expected to hold terabytes of data. Since these datasets hold mainly ...
dhary, G. Memik, M. Kandemir,* S. More, G. Thiruvathukal,† and A. Singh†. Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208. {xhshen, wkliao, choudhar, memik, ssmore}@ece.nwu.edu
threaded runtime library for parallel I/O. We extend the multi-threading concept to separate the compute and I/O tasks into two separate threads of control. Multi-threading in our design permits a) asynchronous I/O even if the underlying file system does not support asynchronous I/O; b) copy avoidance from the I/O thread to the compute thread by sharing address space; and c) a capability to perform collective I/O asynchronously without blocking the compute threads. Further, this paper presents techniques for collective I/O which maximize load balance and concurrency while reducing communication overhead in an integrated fashion. Performance results on the IBM SP2 for various data distributions and access patterns are presented. The results show that there is a tradeoff between the amount of concurrency in I/O and the buffer size designated for I/O, and that there is an optimal buffer size beyond which the benefits of larger requests diminish due to large communication overheads.
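The separation of compute and I/O into two threads of control can be illustrated with a minimal Python sketch. It is not the library described in the abstract: the names `IOThread`, `ReadRequest`, and `read_async` are hypothetical, and the sketch covers only points a) and b) (asynchronous reads built on blocking calls, and a buffer shared between the two threads), not collective I/O.

```python
import queue
import threading


class ReadRequest:
    """Hypothetical handle the compute thread can wait on."""

    def __init__(self, path, offset, buf):
        self.path = path
        self.offset = offset
        self.buf = buf            # caller-owned buffer, shared with the I/O thread
        self.nbytes = 0
        self.error = None
        self._done = threading.Event()

    def wait(self):
        """Block until the read has finished; return the bytes read."""
        self._done.wait()
        if self.error is not None:
            raise self.error
        return self.nbytes


class IOThread:
    """Hypothetical helper: serves blocking reads on a dedicated thread."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        while True:
            req = self._requests.get()
            if req is None:       # shutdown sentinel
                break
            try:
                with open(req.path, "rb") as f:
                    f.seek(req.offset)
                    # readinto fills the caller's buffer directly, so no
                    # intermediate copy is handed between the threads.
                    req.nbytes = f.readinto(req.buf)
            except OSError as exc:
                req.error = exc
            finally:
                req._done.set()

    def read_async(self, path, offset, buf):
        """Queue a read and return immediately with a waitable handle."""
        req = ReadRequest(path, offset, buf)
        self._requests.put(req)
        return req

    def close(self):
        self._requests.put(None)
        self._worker.join()
```

A compute thread would call `read_async`, continue with its own work, and call `wait()` on the returned handle only when it needs the data. The underlying `open`/`readinto` calls are ordinary blocking calls, which is the sense in which the asynchrony does not depend on file-system support.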
We have implemented Passion on Intel's Paragon, Touchstone Delta, and iPSC/860 systems, and on the IBM SP system. We have also made it publicly available through the World Wide Web (http://www.cat.syr.edu/passion.html). We are in the process of porting the ...
With the increasing number of scientific applications manipulating huge amounts of data, effective high-level data management is an increasingly important problem. Unfortunately, so far the solutions to the high-level data management problem either require deep understanding of specific storage architectures and file layouts (as in high-performance file storage systems) or produce unsatisfactory I/O performance in exchange for ease-of-use and portability (as in relational DBMSs).
Tertiary storage systems are used when secondary storage cannot satisfy the data storage requirements, or when tertiary storage is the more cost-effective option. New application domains require on-demand retrieval of data from these devices. This paper investigates issues in optimizing I/O time for a query whose data resides on automated tertiary storage containing multiple storage devices.
Extended collective I/O for efficient retrieval of large objects
Object-relational database management systems (OR-DBMS) extend the capabilities of relational databases by allowing the definition of new data types and methods to operate on these data types while retaining most of the relational model semantics. In this paper, we ...
With the increasing number of scientific applications manipulating huge amounts of data, effective high-level data management is an increasingly important problem. Unfortunately, so far the solutions to the high-level data management problem either require deep understanding of specific storage architectures and file layouts (as in high-performance file storage systems) or produce unsatisfactory I/O performance in exchange for ease-of-use and portability (as in relational DBMSs). In this paper we present a novel application development environment which is built around an active meta-data management system (MDMS) to handle high-level data in an effective manner. The key components of our three-tiered architecture are the user application, the MDMS, and a hierarchical storage system (HSS). Our environment overcomes the performance problems of pure database-oriented solutions, while maintaining their advantages in terms of ease-of-use and portability. The high levels of performance are achieved by the MDMS, with the aid of user-specified, performance-oriented directives. Our environment supports a simple, easy-to-use yet powerful user interface, leaving the task of choosing appropriate I/O techniques for the application at hand to the MDMS. We discuss the importance of an active MDMS and show how the three components of our environment, namely the application, the MDMS, and the HSS, fit together. We also report performance numbers from our ongoing implementation and illustrate that significant improvements are made possible without undue programming effort.
PASSION Runtime Library for the Intel Paragon* Alok Choudhary, Rajesh Bordawekar, Sachin More, K. Sivaram. Dept. of Electrical and Computer Engineering, Syracuse University, Syracuse, NY 13244. {choudhar, rajesh, ssmore, sivaram}@cat.syr.edu ...
This paper presents an approach to recognizing human facial expressions for human-robot interaction. The facial features, especially the eyes and lips, are extracted and approximated using Bézier curves that represent the relationship between the motion of the features and changes of expression. For face detection, color segmentation based on the novel idea of fuzzy classification, which handles ambiguity in colors, has been employed. Experimental results demonstrate that this technique can robustly classify skin and non-skin regions. To decide whether a skin region is a face, largest connectivity analysis has been employed. This method can recognize the facial expression category as well as the degree of facial expression change. Finally, the system has been implemented by issuing facial expression commands to a manipulator robot.
Papers by Sachin More