The US and LHC Computing

Tape storage at Brookhaven National Laboratory's Tier-1 computing center. Image credit: BNL
The LHC poses vast computing challenges to its scientific collaborations. The six experiments at the LHC produce 15 petabytes (15 million gigabytes) of data every year, all of which must be stored, backed up, and made available to more than 8,000 scientists around the globe.

To handle this immense computing load, CERN works with groups worldwide to construct the computing system for the LHC. A key component of this system is a worldwide grid for LHC computing, whose requirements have been driving the development of grid computing for the past decade. Eighteen US institutions participate in the Worldwide LHC Computing Grid.

The US contributes to LHC computing in many ways. It provides large amounts of tape, disk and processing capacity for storing and analyzing acquired and simulated LHC data. High-speed 10-gigabit networks enable round-the-clock distribution of data in real time from CERN to two "Tier-1" computing facilities, at Fermi National Accelerator Laboratory in Illinois and Brookhaven National Laboratory in New York. From there, the data is distributed to "Tier-2" and "Tier-3" computer centers across the country via the ESnet and Internet2 networking projects. Physicists at the Tier-2 and Tier-3 centers analyze the data, and that analysis leads to LHC discoveries.
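To put the data volume and network figures side by side, the sketch below converts 15 petabytes per year into an average transfer rate and compares it with a single 10-gigabit link. It is a rough illustration, not a description of how the grid schedules transfers. It assumes decimal units (1 petabyte = 10^15 bytes, consistent with "15 million gigabytes"), that "10-gigabit" means 10 gigabits per second, and that data is produced evenly over a 365-day year. Real traffic is bursty and is shared among many sites, so peak rates can be much higher than this average. The function and constant names are hypothetical.

```python
# Rough average-rate estimate. All assumptions are illustrative:
# decimal units, uniform production over a 365-day year, one 10 Gbit/s link.
PETABYTE = 10**15                     # bytes in a decimal petabyte
SECONDS_PER_YEAR = 365 * 24 * 3600    # 31,536,000 seconds
LINK_GBIT_PER_S = 10                  # assumed meaning of "10-gigabit"


def average_rate_gbit_per_s(petabytes_per_year: float) -> float:
    """Average rate, in gigabits per second, needed to move a yearly volume."""
    bytes_per_second = petabytes_per_year * PETABYTE / SECONDS_PER_YEAR
    return bytes_per_second * 8 / 1e9


rate = average_rate_gbit_per_s(15)
link_fraction = rate / LINK_GBIT_PER_S

print(f"Average rate: {rate:.2f} Gbit/s")            # Average rate: 3.81 Gbit/s
print(f"Share of one link: {link_fraction:.0%}")     # Share of one link: 38%
```

Under these assumptions, the yearly volume averages out to a little under 4 gigabits per second, which is why sustained multi-gigabit links are needed simply to keep pace with the detectors.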

Cables at Fermilab's Grid Computing Center. Image credit: Fermilab
Grid computing and advanced networking are needed for thousands of scientists to effectively use LHC data. The Worldwide LHC Computing Grid integrates computer storage and processing power around the globe for the use of LHC physicists, and represents a leap forward in distributed computing technology. The US contributes to the Worldwide LHC Computing Grid through the Open Science Grid. The OSG, supported by the US Department of Energy and the National Science Foundation, is a national, shared cyberinfrastructure used not only by LHC scientists, but by scientists in many fields including biology, nanotechnology, geography and astrophysics.

The US contributes 34 percent of the worldwide computing capacity for the ATLAS experiment, and more than 53 percent of the capacity for the CMS experiment. The ATLAS Tier-1 center at Brookhaven National Laboratory and the CMS Tier-1 center at Fermilab combine to provide more than 13.7 petabytes of data storage and 175,000 hours per day of computation time. The Tier-2 centers, which include participation from fifteen universities and one national laboratory, together offer more than three petabytes of data storage and 283,000 hours per day of computing capacity for ATLAS and CMS simulation and analysis.
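The "hours per day" figures are easier to picture as a number of processors running around the clock. The sketch below makes that conversion under the assumption that each quoted hour is one processor core busy for one hour. The source does not define the unit, so this is an interpretation rather than an official figure. The function name is hypothetical.

```python
# Convert computing capacity in hours per day into an equivalent number of
# continuously busy cores. Assumes one "hour" means one core for one hour.
HOURS_PER_DAY = 24


def equivalent_cores(cpu_hours_per_day: float) -> float:
    """Number of cores that would deliver this capacity if busy 24 hours a day."""
    return cpu_hours_per_day / HOURS_PER_DAY


tier1_cores = equivalent_cores(175_000)   # Tier-1 centers (BNL and Fermilab)
tier2_cores = equivalent_cores(283_000)   # US Tier-2 centers combined

print(f"Tier-1: about {tier1_cores:,.0f} cores")   # Tier-1: about 7,292 cores
print(f"Tier-2: about {tier2_cores:,.0f} cores")   # Tier-2: about 11,792 cores
```

Read this way, the two Tier-1 centers together correspond to roughly 7,300 cores running nonstop, and the Tier-2 centers to roughly 11,800.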

Scientists from the US ATLAS and US CMS software and computing projects make significant contributions to the software used to distribute and manage the data and computation needed by the experiments, as well as to the physics software through which researchers discover new fundamental properties of the universe.

The Open Science Grid and the LHC

Map of Open Science Grid institutions in the United States.

The Open Science Grid is a distributed computing infrastructure for large-scale scientific research. The OSG contributes to the Worldwide LHC Computing Grid as the shared distributed computing facility used by the ATLAS and CMS experiments.

The OSG is built and operated by a consortium of 90 universities, national laboratories, scientific collaborations and software developers. It is supported by the National Science Foundation and the US Department of Energy Office of Science. The OSG supports not only physics experiments but also researchers from other fields, including astrophysics, bioinformatics and computer science. The OSG currently has more than 70 sites in the US, along with sites in Brazil, Taiwan, Colombia, China, Italy and Mexico.

All LHC computing and storage sites in the US are members of the OSG, and they allow other scientific collaborations on the OSG to use available resources opportunistically. The OSG collaborates with the Enabling Grids for E-sciencE project in Europe to provide interoperating, federated infrastructures that the LHC experiments' software can use transparently. In 2006-2007, the OSG provided ATLAS and CMS with 30 percent of total processing cycles and moved more than 100 terabytes of data across seven sites.