
Andrew File System

8 papers
2 followers
About this topic
The Andrew File System (AFS) is a distributed file system that enables efficient file sharing across multiple networked computers. It provides a unified namespace, allowing users to access files stored on remote servers as if they were local, while supporting features like caching, replication, and security through authentication mechanisms.
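
As a rough illustration of the unified namespace, the sketch below shows how one globally agreed path prefix lets every client name the same remote file regardless of which server holds it. Only the "/afs/<cell>/..." prefix convention comes from AFS; the helper function and the example cell name are illustrative assumptions, not part of any AFS implementation.

# Hypothetical sketch: splitting an AFS-style "/afs/<cell>/..." path into a
# cell name and a cell-relative path. Everything beyond the "/afs" prefix
# convention is made up for illustration.
def resolve_afs_path(path: str) -> tuple[str, str]:
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "afs":
        raise ValueError(f"not an AFS-style path: {path}")
    return parts[1], "/".join(parts[2:])

# The same name resolves identically on every client machine.
print(resolve_afs_path("/afs/example.edu/user/alice/thesis.tex"))
# -> ('example.edu', 'user/alice/thesis.tex')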

Key research themes

1. How does the Andrew File System (AFS) balance large-scale distributed file sharing with UNIX compatibility and local caching strategies?

This theme investigates how AFS was designed as a campus-wide distributed file system supporting thousands of workstations while meeting performance and availability goals. It specifically examines the design decisions around UNIX compatibility, whole-file caching on local disks, and server architecture that balance scalability, network efficiency, and user experience.
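
As a concrete, hedged sketch of the whole-file caching model described above: the client keeps complete copies of files on its local disk and contacts the server only when it has no copy or when its callback promise has been broken. The server interface (fetch, store, callback_valid), the cache directory, and all other names below are assumptions made for illustration, not actual AFS client code.

import os

CACHE_DIR = "/var/cache/afs-sketch"        # assumed local cache location

class WholeFileCacheClient:
    """Illustrative AFS-style client: entire files are cached on local disk."""

    def __init__(self, server):
        self.server = server               # assumed stub with fetch/store/callback_valid

    def _local_path(self, remote_path):
        return os.path.join(CACHE_DIR, remote_path.strip("/").replace("/", "_"))

    def open(self, remote_path, mode="r"):
        local = self._local_path(remote_path)
        # Re-fetch the whole file only if it is missing locally or the server
        # has broken the callback promise (another client changed the file).
        if not os.path.exists(local) or not self.server.callback_valid(remote_path):
            os.makedirs(CACHE_DIR, exist_ok=True)
            self.server.fetch(remote_path, local)     # one bulk transfer
        return open(local, mode)                      # reads and writes hit local disk

    def close(self, remote_path):
        # On close, the (possibly modified) whole file is shipped back to the server.
        self.server.store(remote_path, self._local_path(remote_path))

Once a file has been fetched, ordinary reads and writes generate no network traffic at all; the trade-off is that another client may briefly see stale data until the server breaks its callback.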

Key finding: The Andrew File System was architected to support up to 7,000 users by making the client-side caching of entire files a central design choice. This whole-file caching improves availability and reduces network loads by...
Key finding: While not directly about AFS, the Nebula file system discussed in this work offers an alternative model emphasizing attribute-based file objects and dynamic views rather than traditional directories, contrasting with AFS’s...
by AMEY THAKUR and 1 more
Key finding: This comparative study highlights how AFS achieves fault tolerance and scalability through its distributed cache coherence and local disk caching, providing high availability with multiple small servers rather than large...

2. What architectural and protocol innovations enable high-performance network-attached storage in file systems like AFS and its successors?

This theme explores how newer file system designs, including those influenced by AFS, address client overhead and data transfer efficiency through innovations in network protocols (e.g., RDMA) and user-level file system integration. The focus is on architectural adaptations that minimize kernel involvement, optimize caching and locking, and integrate with high-speed networks to support demanding workloads.
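
The RDMA-based approach in the first finding below can be approximated, very loosely, at user level: set up one transfer buffer in advance and have every request deposit data into it, instead of allocating and copying per call. The sketch below uses a local file and readinto() as a stand-in for an RDMA-capable network; the buffer size and file name are illustrative assumptions.

# Loose, user-level approximation of the "pre-registered buffer" idea behind
# RDMA-style transfers. A real DAFS client registers memory with the network
# adapter so the remote side can place data into it directly; here a reusable
# bytearray plus readinto() merely avoids per-request allocation and copying.
BUFFER_SIZE = 1 << 20                      # 1 MiB transfer buffer, set up once
buffer = bytearray(BUFFER_SIZE)
view = memoryview(buffer)                  # stand-in for a registered memory region

def read_block(fileobj, offset):
    """Read one block directly into the pre-allocated buffer."""
    fileobj.seek(offset)
    return fileobj.readinto(view)          # bytes land in `buffer`, no new allocation

with open("/tmp/example.dat", "rb") as f:  # illustrative data source
    n = read_block(f, 0)
    print(f"placed {n} bytes into the reusable buffer")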

Key finding: The Direct Access File System (DAFS) builds on the client-server model exemplified by AFS by implementing a user-level client architecture combined with RDMA-capable networks to reduce overhead, eliminate kernel bottlenecks,...
Key finding: The Frog framework demonstrates a modular approach to integrate multiple context-aware I/O optimizations in a unified file system interface. By abstracting solutions as views, Frog reduces overhead from traditional context...
Key finding: This work investigates consistent and efficient data sharing mechanisms over iSCSI-based SAN environments, focusing on cache coherence and concurrency control adapted to high-latency, TCP/IP-based storage networks. Techniques...

3. How do distributed file systems manage data durability, replication, and fault tolerance across large-scale, multi-node environments?

This theme covers mechanisms for ensuring data survival and fault tolerance in distributed file systems, including replication strategies, erasure codes, and methods for handling node and network failures. It emphasizes resilience techniques crucial for systems like AFS that scale to thousands of nodes, and explores the trade-offs between centralized and decentralized storage management.
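
To make the replication-versus-erasure-coding comparison in the findings below concrete, here is a back-of-envelope sketch assuming every node is independently available with probability p; the particular values of p, r, n, and k are illustrative and not taken from the papers.

from math import comb

def replication_availability(p, r):
    # Data is readable if at least one of the r full replicas is up.
    return 1 - (1 - p) ** r

def erasure_availability(p, n, k):
    # Data is readable if at least k of the n coded fragments are up.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p = 0.9
print(replication_availability(p, r=3))    # ~0.999 at 3x storage overhead
print(erasure_availability(p, n=12, k=8))  # ~0.996 at only 1.5x storage overhead

The catch under node churn is repair cost: rebuilding one lost fragment of an (n, k) code typically requires reading from about k other nodes, while restoring a replica needs only one, which is one reason simple replication can win in highly volatile environments.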

Key finding: The paper models peer-to-peer storage systems' data longevity under node volatility and availability, comparing replication and erasure coding. It finds that simple replication may outperform complex codes in environments...
Key finding: This study demonstrates the feasibility of a distributed disk storage system pooling underutilized storage across networked machines with replication to ensure resilience against node failures. The system’s file splitting,...
Key finding: HighLight extends a log-structured file system design to hierarchical storage incorporating robotic tertiary storage, optimizing write performance and arranging data migration policies to improve read accesses. The...

All papers in Andrew File System

The BaBar experiment has accumulated many terabytes of data on particle physics reactions, accessed by a community of hundreds of users. Typical analysis tasks are C++ programs, individually written by the user, using shared templates and...
Department of CSE, Ramaiah Institute of Technology, Bengaluru, India. Abstract: Distributed...
The amount of immutable files, such as images, video clips, audio files, and e-mail messages, is expected to grow significantly, as users actively generate, distribute, share, and re-use digital contents. In this paper, we present BeanFS,...
In preparation for the experiment at CERN's Large Hadron Collider (LHC), the ALICE collaboration has developed AliEn, a production environment that implements several components of the Grid paradigm needed to simulate, reconstruct and...
by AMEY THAKUR and 1 more
Distributed File Systems are the backbone of how large volumes of data are stored. Hadoop File Systems, Google File Systems, and Network File Systems have all shifted the way data is maintained on servers. In terms of performance, fault...
Conclusions: BdbServer++ has highlighted some of the areas of opportunity for the current middleware. This has been fed back into the development process for future releases. The issue of data location is, however, one where ideal solutions...
This paper describes how the London e-Science Centre cluster MARS, a production 400+ Opteron CPU cluster, was integrated into the production Large Hadron Collider Compute Grid. It describes the practical issues that we encountered when...
We present two versions of a grid job submission system produced for the BaBar experiment. Both use globus job submission to process data spread across various sites, producing output which can be combined for analysis. The problems...
Objective: The objective of this document is to survey all available HEP use case documents, select from them the use cases which are common to Grid-like computing systems, and identify the implications on the metadata catalogue which is to...
This paper discusses the use of e-Science Grid in providing computational resources for modern international High Energy Physics (HEP) experiments. We investigate the suitability of the current generation of Grid software to provide the...
The BABAR Collaboration, based at Stanford Linear Accelerator Center (SLAC), Stanford, US, has been performing physics reconstruction, simulation studies and data analysis for 8 years using a number of compute farms around the world....
The BaBar experiment has been taking data since 1999. In 2001 the computing group started to evaluate the possibility of evolving toward a distributed computing model in a grid environment. We built a prototype system, based on the European...
The BaBar experiment involves 500+ physicists spread across the world, with a requirement to access and analyse hundreds of Terabytes of data. Grid-based tools are increasingly being used for the manipulation of data and metadata, and...
We have created a lightweight (server and client) intuitive command line interface through which users can submit jobs with complex software dependencies. We have added functionality that is specific to BaBar jobs. gsub solves the...