National Science Foundation support for scientific cyberinfrastructure dates to the 1960s. Since about 2000, however, efforts in cyberinfrastructure development have gathered momentum, guided by an increasingly comprehensive vision. Yet assembling the range of NSF-sponsored projects into a genuine infrastructure (highly reliable, widely accessible basic capabilities and services supporting the full range of scientific work) remains an elusive goal. Close study of other infrastructures, from railroads and electric power grids to telephone, cellular services, and the Internet, provides insights that can help guide and consolidate the NSF vision. Since the 1980s, historians, sociologists, and information scientists have been studying how and why infrastructures form and evolve; how they work; and how they (sometimes) disintegrate or fail.

In September 2006, a three-day NSF-funded workshop on "History and Theory of Infrastructure: Lessons for New Scientific Cyberinfrastructures" took place at the University of Michigan. Participants included experts in social and historical studies of infrastructure development, and domain scientists, information scientists, and NSF program officers involved in building, using, and funding cyberinfrastructure. The goal was to distill concepts, stories, metaphors, and parallels that might help realize the NSF vision for scientific cyberinfrastructure. This report summarizes the workshop findings, and outlines a research agenda for the future.

Social and historical analyses reveal some base-level tensions that complicate the work of infrastructural development. These include:
• Time, e.g. short-term funding decisions vs. the longer time scales over which infrastructures typically grow and take hold
• Scale, e.g. disconnects between global interoperability and local optimization
• Agency, e.g. navigating processes of planned vs. emergent change in complex and multiply-determined systems.
Such complications challenge simple notions of infrastructure building as a planned, orderly, and mechanical act. They also suggest that boundaries between technical and social solutions are mobile, in both directions: the path between the technological and the social is not static and there is no one correct mapping. Robust cyberinfrastructure will develop only when social, organizational, and cultural issues are resolved in tandem with the creation of technology-based services. Sustained and proactive attention to these concerns will be critical to long-term success.

Dynamics. Historical infrastructures (the automobile/gasoline/roadway system, electrical grids, railways, telephony, and most recently the Internet) become ubiquitous, accessible, reliable, and transparent as they mature. The initial stage in infrastructure formation is system-building, characterized by the deliberate and successful design of technology-based services. Next, technology transfer across domains and locations results in variations on the original design, as well as the emergence of competing systems. Infrastructures typically form only when these various systems merge, in a process of consolidation characterized by gateways that allow dissimilar systems to be linked into networks. In this phase, standardization and inter-organizational communication techniques are critical. As multiple systems assemble into networks, and networks into webs or "internetworks," early choices constrain the options available moving forward, creating what historical economists call "path dependence."

Tensions. Transparent, reliable infrastructural services create vast benefits, but there are always losers as well as winners in infrastructure formation. Questions of ownership, management, control, and access are always present. For example:
• Who decides on rules and conventions for sharing, storing, and preserving data?
• Local variation vs. global standards: how do we resolve frictions between localized routines and cultures that stand in the way of effective collaboration?
• How can national cyberinfrastructure development move forward without compromising possibilities for international or even global infrastructure formation?

Design. These and other tensions inherent to infrastructure growth present imperatives to develop navigation strategies that recognize the likelihood of unforeseen (and potentially negative) path dependence and/or institutional or cultural barriers to adoption. Cyberinfrastructure seeks to enable a decentralized research environment that: 1) permits distributed collaboration; 2) provides incentives for participation at all levels; and 3) encourages the advancement of cross-boundary and interdisciplinary scholarship. Since all three of these goals are simultaneously social and organizational in nature and central to the technical base, designing effective navigation strategies will depend on strategic collaborations between social, domain, and information scientists. In particular, comparative studies of cyberinfrastructure projects can reveal key factors in success (and failure). Research on practices of standardization and modularity can help retain the openness, flexibility, and broad-scale usability of cyberinfrastructure, minimizing the path-dependent effects of standard-setting.

Recommendations: NSF should consider action in three broad areas.
• Learning from cyberinfrastructure. By applying well-understood evaluation tools, we can assess and compare existing cyberinfrastructure projects, both in the US and abroad. The resulting knowledge can be used to improve reporting mechanisms and incentive structures. Cyberinfrastructure projects can also be instrumented to collect social and organizational data.
• Improving cyberinfrastructural practice. Social science research can assist with NSF goals of training and enrolling professionals into the cyberinfrastructure-based research agenda. These goals may be achieved in part by improving diagnostics for current research environments, providing direct training for information managers, graduate students, and early-career faculty, and developing funding structures that support work on multiple time scales.