The major physics experiments of the next twenty years will break new ground in our understanding of the fundamental interactions, structures and symmetries that govern the nature of matter and spacetime. Realizing the scientific wealth of these experiments presents new problems in data access, processing and distribution, and collaboration across national and international networks, on a scale unprecedented in the history of science.
The challenges include:
- The extraction of small or subtle new physics signals from large and potentially overwhelming backgrounds,
- Providing rapid access to event samples and subsets drawn from massive data stores, rising from hundreds of Terabytes in 2000 to Petabytes by 2005, and to 100 Petabytes by 2010,
- Providing secure, efficient and transparent access to heterogeneous worldwide-distributed computing and data handling resources, across an ensemble of networks of varying capability and reliability,
- Tracking the state and usage patterns at each site and across sites, in order to enable rapid turnaround as well as efficient utilization of global resources (a simple site-selection sketch follows this list),
- Providing the collaborative infrastructure that will make it possible for physicists in all world regions to contribute effectively to the analysis and the physics results, including while based at their home institutions.
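To make the resource-tracking challenge concrete, the following is a minimal sketch of state-driven site selection: given tracked per-site load, usable bandwidth and liveness, a request for an event sample is routed to the replica site with the lowest expected turnaround. The site names, metrics and cost model are illustrative assumptions, not any actual Grid middleware API.

```python
# Illustrative sketch only: pick the replica site with the best expected
# turnaround, given tracked per-site state. All names and numbers are
# hypothetical; no real Grid middleware is being modeled here.
from dataclasses import dataclass

@dataclass
class SiteState:
    name: str          # hypothetical site identifier
    replica_gb: float  # size of the locally held replica of the sample
    load: float        # current utilization, 0.0 (idle) to 1.0 (saturated)
    net_mbps: float    # measured usable bandwidth to the requesting client
    up: bool           # result of the latest liveness probe

def expected_turnaround_s(site: SiteState) -> float:
    """Crude cost model: transfer time, inflated by current site load."""
    transfer_s = site.replica_gb * 8000.0 / site.net_mbps  # GB -> Megabits
    return transfer_s / max(1e-3, 1.0 - site.load)

def pick_site(sites: list[SiteState]) -> SiteState:
    """Route to the live site with the lowest expected turnaround."""
    live = [s for s in sites if s.up]
    if not live:
        raise RuntimeError("no live replica site holds this sample")
    return min(live, key=expected_turnaround_s)

if __name__ == "__main__":
    sites = [
        SiteState("principal-site", 500.0, 0.9, 622.0, True),
        SiteState("regional-center", 500.0, 0.4, 155.0, True),
        SiteState("local-center", 500.0, 0.2, 34.0, True),
    ]
    print("route request to", pick_site(sites).name)  # -> regional-center
```

Under this toy model the heavily loaded principal site loses to a less loaded regional center despite its faster link, which is precisely the kind of trade-off that global state tracking is meant to expose.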
In my talk I will provide a perspective on the key computing, networking and software issues, and on the ongoing R&D aimed at building a worldwide-distributed system to meet these diverse challenges. Over the last year this concept has evolved into that of a data-intensive, hierarchical "Data Grid" of national centers linked to the principal center at the experimental site, and to regional and local centers. I will summarize the role of recent projects on distributed systems and "Grids" in the US and Europe, touch on the synergy between these developments and work in other fields, and briefly discuss their potential importance for scientific research and industry in the coming years.
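As a rough illustration of the hierarchical "Data Grid" described above, here is a small sketch under stated assumptions: a principal (Tier-0) center at the experimental site, regional (Tier-1) centers, and local (Tier-2) centers, where a request is served at the lowest tier that holds the dataset and escalated to the parent otherwise. The center names, dataset labels and the escalate-to-parent rule are placeholders for illustration, not part of any specific design.

```python
# Minimal model of a tiered Data Grid: requests are served at the lowest
# tier that holds the dataset, else escalated toward the principal center.
# Names, tier assignments and dataset labels are hypothetical.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Center:
    name: str
    tier: int                      # 0 = experimental site, 1 = regional, 2 = local
    datasets: set[str] = field(default_factory=set)
    parent: Center | None = None

    def attach(self, child: Center) -> Center:
        """Register a lower-tier center below this one."""
        child.parent = self
        return child

def locate(start: Center, dataset: str) -> Center | None:
    """Walk up the hierarchy until some center holds the dataset."""
    node: Center | None = start
    while node is not None:
        if dataset in node.datasets:
            return node
        node = node.parent
    return None

# Hypothetical three-level topology.
tier0 = Center("experiment-site", 0, {"raw", "summary", "analysis"})
tier1 = tier0.attach(Center("national-center", 1, {"summary", "analysis"}))
tier2 = tier1.attach(Center("university-group", 2, {"analysis"}))

print(locate(tier2, "analysis").name)  # -> university-group (served locally)
print(locate(tier2, "raw").name)       # -> experiment-site (escalated to the top)
```

One motivation for such a hierarchy is locality: most analysis requests can be served at regional or local centers, so only the rarest accesses need to cross long-haul links to the principal site.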