Grid computing has recently gained in popularity. Grid applications can be very demanding of the data storage facilities in the Grid. Existing data grid services are often insufficient, and additional optimization of data access is necessary. This paper presents research concerning the optimization aspects of data management for Hierarchical Storage Management (HSM) systems in the Grid.
Concurrency and Computation: Practice and Experience, Nov 4, 2016
Sensitivity analysis is widely used in numerical simulations applied in industry. The robustness of such applications is crucial, which means they have to be fast and precise at the same time. However, the conventional approach to sensitivity analysis requires multiple executions of computationally intensive simulations to discover input/output dependencies. In this paper, we present a novel computational approach to large-scale sensitivity analysis, integrated with Scalarm, an extended platform for parameter studies, which makes use of modern e-infrastructures for distribution and parallelization, profitable for complicated industrial problems.
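The multiple-execution approach to sensitivity analysis described above can be illustrated with a minimal one-at-a-time finite-difference sketch. This is a generic illustration only, assuming a simple callable model; it is not the computational approach or the Scalarm integration from the paper.

```python
def one_at_a_time_sensitivity(model, baseline, delta=1e-3):
    """Estimate how sensitive `model` output is to each input by
    perturbing one parameter at a time (finite differences).
    Each perturbation costs one extra simulation run, which is why
    large-scale sensitivity analysis benefits from distributing
    the executions over an e-infrastructure."""
    base = model(**baseline)
    sensitivities = {}
    for name, value in baseline.items():
        perturbed = dict(baseline, **{name: value + delta})
        sensitivities[name] = (model(**perturbed) - base) / delta
    return sensitivities
```

Each parameter's sensitivity is an independent simulation run, so all perturbations can be executed in parallel.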
Metadata Organization and Management for Globalization of Data Access with Onedata
Lecture Notes in Computer Science, 2016
The Big Data revolution means that large amounts of data have not only to be stored, but also to be processed, to unlock the potential of access to information and knowledge for scientific research. As a result, scientific communities require simple and convenient global access to data which is effective, secure and shareable. In this article we analyze how researchers use their data when working in large scientific projects and show how their requirements may be satisfied by our solution, Onedata. The major technical mechanisms of metadata management and organization are described.
The concept of Open Science emerges as a powerful new trend, allowing researchers to exchange and reuse valuable knowledge, data and analyses. Innovative tools are needed to facilitate such global scientific collaboration, which is the main objective of the Onedata system. It aspires to provide an Open Science platform based on openness and decentralization. To achieve this, a distributed location service must be introduced that allows resources to be located in this vast environment. This paper identifies Kademlia DHT networks as a viable solution, which nevertheless requires improvements to be applicable in Onedata. It analyzes the consistency problems and attack-resistance deficiencies of Kademlia and proposes additional consistency checks as a solution. The implications of such modifications are discussed, and the modified Kademlia is compared against the original Kademlia algorithm.
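For context, Kademlia routing is built on an XOR distance metric between node and key identifiers. The following is a minimal illustrative sketch of that metric, not the modified algorithm or the consistency checks proposed in the paper:

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance between two node/key IDs: bitwise XOR.
    It is symmetric and satisfies the triangle inequality, which is
    what makes iterative lookups converge."""
    return a ^ b

def closest_nodes(target: int, node_ids: list[int], k: int = 20) -> list[int]:
    """Return the k known node IDs closest to `target` under XOR,
    as a lookup step would when querying its routing table."""
    return sorted(node_ids, key=lambda n: xor_distance(target, n))[:k]
```

A lookup repeatedly asks the currently closest nodes for even closer ones, halving the remaining distance at each step on average.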
Parameter studies on heterogeneous computing infrastructures with the Scalarm platform
The main goal of this tutorial is to demonstrate the Scalarm platform as a tool supporting parameter studies on different computing infrastructures such as clusters, grids and clouds. A parameter study (also called a parameter sweep) is an approach in which the same application is executed many times with different input parameter values. Afterwards, results from every execution are collected to identify trends or anomalies in the data that may lead to new insights. Despite their simplicity, parameter studies are commonly used in many fields of science, including computational fluid dynamics, particle physics, and discrete-event simulation. They are arguably the most popular way of organizing scientific computing.
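The parameter-sweep pattern described above can be sketched in a few lines. This is a sequential toy version for illustration; in Scalarm the individual executions would be distributed across clusters, grids or clouds rather than run in a local loop:

```python
import itertools

def parameter_sweep(simulate, parameter_space):
    """Run `simulate` once for every combination of input parameter
    values (the Cartesian product of the per-parameter value lists)
    and collect (input point, result) pairs for later analysis."""
    names = list(parameter_space)
    results = []
    for values in itertools.product(*(parameter_space[n] for n in names)):
        point = dict(zip(names, values))
        results.append((point, simulate(**point)))
    return results

# Toy simulation: a 2x2 sweep produces four executions.
runs = parameter_sweep(lambda x, y: x * y, {"x": [1, 2], "y": [10, 20]})
```

The collected results are then scanned for trends or anomalies, exactly as the tutorial describes.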
Recent years have significantly changed the perception of web services and data storage, as clouds have become a large part of the IT market. New challenges appear in the field of scalable web systems, which grow ever bigger and more complex. One of them is designing load-balancing algorithms that allow optimal utilization of server resources in large, distributed systems. This paper presents an algorithm called Two-Level Load Balancing, which has been implemented and evaluated in Onedata, a global data access system. A study of Onedata architecture, request types and use cases has been performed to determine the load-balancing requirements set by similar, highly scalable distributed systems. The algorithm was designed to match these requirements, which was achieved by using a synergy of
Service Level Agreement Metrics for Real-Time Application on the Grid
Springer eBooks, May 28, 2008
Highly demanding applications running on grids need carefully prepared environments. The real-time High Energy Physics (HEP) application from the Int.eu.grid project is a good example of an application with requirements that are difficult to fulfill in typical grid environments. In this paper we present the Service Level Agreement metrics which are used by the application's dedicated virtual organization (HEP VO) to sign SLAs with service providers. With signed SLAs, the HEP VO is able to guarantee sufficient service quality for the application. These SLAs are enforced using the presented VO Portal.
Public organisations often face knowledge management problems caused by organisational mobility: the continual and pervasive movement of staff between units and departments. This can be a strength, but it also introduces loss of experience and reduced efficiency and effectiveness in working. The Pellucid project is developing a knowledge management system to assist in such situations. The project's three pilot applications are described, and an abstracted view of their knowledge management needs is developed. An outline of the key parts of the technical solution is given.
The progress made in the field of Cloud computing and the continuously growing user demand for services with guaranteed storage performance parameters bring new challenges. Storage system monitoring, resource scheduling and performance prediction are essential for the successful operation of a distributed environment and for fulfillment of the Service Level Agreement. Given the heterogeneity of storage resources in distributed environments, it is essential to provide transparency of the monitored storage system performance parameters. In this paper we present a common storage system model which accounts for Quality of Service requirements and the dynamics of performance parameters. We also present the process of storage ontology development based on this model, and we show a use case of the proposed ontology in a storage monitoring service.
In this paper we present the results of a two-year study aimed at developing a full-fledged computer environment supporting post-stroke rehabilitation. The system was designed by a team of computer scientists, psychologists and physiotherapists and adopts a holistic approach to rehabilitation. In order to extend the rehabilitation process, the applied methods include a remote rehabilitation stage which can be carried out at the patient's home. The paper presents the distributed system architecture as well as the results achieved by patients prior to and following a three-month therapy based on the presented system.
In affiliate marketing, an affiliate offers to handle the marketing effort of selling other companies' products. Click fraud is damaging to affiliate marketers as it increases the cost of internet traffic. There is a need for a solution that has an economic incentive to protect marketers while providing them with the data they need to reason about traffic quality. In our solution, we propose a set of interpretable, explainable flags to describe the traffic. Given the different needs of marketers, the differences in traffic quality across campaigns and the noisy nature of internet traffic, we propose the use of equality testing of two proportions to highlight flags which are important in certain situations. We present measurements of real-world traffic using these flags.
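The equality testing of two proportions mentioned above is a standard statistical test; a minimal sketch of the classic pooled two-proportion z-test follows. This is a textbook formulation for illustration, with hypothetical example counts, not the paper's actual flag pipeline:

```python
from math import erf, sqrt

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided pooled z-test for equality of two proportions,
    e.g. the rate at which a traffic-quality flag fires in two
    campaigns. Returns (z statistic, two-sided p-value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts: a flag fires on 50 of 1000 clicks in one
# campaign and 80 of 1000 in another.
z, p_value = two_proportion_z(50, 1000, 80, 1000)
```

A small p-value indicates that the flag's firing rate genuinely differs between the two campaigns, so the flag is worth highlighting for that comparison.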
Effective and Scalable Data Access Control in Onedata Large Scale Distributed Virtual File System
Procedia Computer Science, 2017
Nowadays, as large amounts of data are generated, whether from experiments, satellite imagery or simulations, access to this data becomes challenging for users who need to process it further, since existing data management makes it difficult to effectively access and share large data sets. In this paper we present an approach to enabling easy and secure collaboration based on state-of-the-art authentication and authorization mechanisms, an advanced group/role mechanism for flexible authorization management, and support for identity mapping between local systems, as applied in Onedata, an eventually consistent distributed file system.
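The group/role authorization idea mentioned above can be illustrated with a minimal role-based access check. The role names and permission sets below are entirely hypothetical; Onedata's actual group/role and identity-mapping mechanisms are far richer than this sketch:

```python
# Hypothetical role-to-permission mapping for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "contributor": {"read", "write"},
    "admin": {"read", "write", "share", "manage"},
}

def is_allowed(user_roles, action):
    """Grant `action` if any of the user's roles permits it.
    In a group-based scheme, `user_roles` would be derived from
    the groups a user belongs to rather than assigned directly."""
    return any(action in ROLE_PERMISSIONS.get(r, set()) for r in user_roles)
```

Deriving roles from group membership keeps per-user administration cheap, which matters at the scale of large scientific collaborations.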
Open-data research is an important factor accelerating the production and analysis of scientific results as well as worldwide collaboration; still, very little data is being shared at scale. The aim of this article is to analyze existing data-access solutions along with their usage limitations. After analyzing the existing solutions and the needs of data-access stakeholders, the authors propose their own vision of a data-access model.
Papers by Renata G Słota