The CMS Heavy Ion Run in 2024

CMS has a rich heavy-ion physics program covering measurements of jet quenching, quarkonium suppression, heavy flavor, flow correlations, and forward physics. Many of the physics goals involve rare probes of the quark-gluon plasma that rely on large data samples. The goal for Run 3 is about 6/nb of integrated luminosity. Several observables, such as heavy-flavor and particle-correlation studies, cannot be accessed with dedicated online triggers, and therefore a large sample of minimum-bias events is required. In addition to hadronic interactions, there is an increasing emphasis on ultra-peripheral collisions (UPC), which use the LHC as a photon-photon or photon-nucleus collider. These collisions require large samples of low-multiplicity events to access physics at extremely low x.

It was estimated that, with the current LHC Run 3 schedule, the heavy-ion community would need at least 40 billion recorded minimum-bias events to significantly advance the hadronic-interaction program. An even greater impact could be achieved with 60 billion minimum-bias events if the run is extended to 2026. In addition, the UPC program needs about 50 billion UPC events, which implies a UPC event rate of up to 80 kHz. These requirements, combined with the low uptime provided by the LHC in the 2023 PbPb run and the shortening of the 2024-2025 heavy-ion runs with respect to the original plan, make it necessary to collect the entire hadronic data sample during the next Run 3 PbPb run. For comparison, during the 2023 PbPb run, which spanned five weeks, roughly 15 billion minimum-bias events and 10 billion UPC events were recorded.
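
As a rough back-of-the-envelope sketch (not an official estimate), these event counts can be translated into the live data-taking time they imply, using the 50 kHz hadronic trigger rate quoted below and the 80 kHz UPC rate:

```python
def live_days(n_events: float, rate_hz: float) -> float:
    """Live data-taking time in days needed to record n_events at a constant rate."""
    return n_events / rate_hz / 86400.0

print(f"40e9 min-bias events at 50 kHz: {live_days(40e9, 50e3):.1f} live days")
print(f"50e9 UPC events at 80 kHz:      {live_days(50e9, 80e3):.1f} live days")
# -> roughly 9.3 and 7.2 days of live data-taking, respectively, which must fit
#    inside a few-week PbPb run once LHC availability is taken into account.
```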

These requirements put extreme pressure on the CMS DAQ and its file transfer system. Considering an average hadronic event size of 0.6 MB/event and a trigger rate of 50 kHz, this corresponds to 30 GB/s for the hadronic component alone. Note that sophisticated data-reduction and compression techniques have already been put in place, yielding a 54% event-size reduction. A dedicated data-taking strategy, giving priority to hadronic collisions during the high-luminosity part of the fill and to UPC collisions during the remainder of the fill, is in place to reduce the stress on the system. Overall, the heavy-ion physics program at full strength demands a throughput of about 30 GB/s once the hard-probe, minimum-bias, and UPC contributions are considered.
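
The 30 GB/s figure follows directly from the quoted numbers; a minimal worked example:

```python
event_size_mb = 0.6       # average hadronic event size (MB/event), as quoted above
trigger_rate_hz = 50_000  # hadronic trigger rate (50 kHz)

throughput_gb_s = event_size_mb * trigger_rate_hz / 1000.0
print(f"Hadronic throughput: {throughput_gb_s:.0f} GB/s")  # -> 30 GB/s
```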

These are enormous requirements, well beyond anything the original storage manager and transfer system was designed for. In the data flow used so far during Run 2 and Run 3, events are selected by the high-level trigger on the Filter Units (FUs) and temporarily stored in small data files. Those small files are sent to the Builder Units (BUs), of which there are about 60, and merged into a common file located on a large disk buffer served by the global file system Lustre. The files are then transferred from Lustre to Tier-0 at CERN. Note that events are written to files on the FUs in intervals of 23 seconds (so-called "luminosity sections") and separated into different event types (so-called "streams"). The two main functions of this system are to allow data taking to continue even if the connectivity to Tier-0 at CERN fails, and to be able to write data for relatively short intervals at rates larger than the available bandwidth to the EOS storage at CERN. Given the current performance and capacity of this system, the maximum sustained throughput is about 18-19 GB/s, far below the rates desired by the heavy-ion community.
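
The following is a highly simplified sketch of what the merging step conceptually does in this flow. The file-naming scheme, the Lustre path, and the function name are illustrative assumptions, not the actual storage manager code:

```python
import shutil
from pathlib import Path

LUSTRE = Path("/lustre/storemanager")   # hypothetical mount point of the Lustre buffer

def merge_lumisection(run: int, ls: int, stream: str, input_dir: Path) -> Path:
    """Concatenate all small FU files of one luminosity section and stream."""
    pattern = f"run{run}_ls{ls:04d}_stream{stream}_fu*.dat"   # assumed naming scheme
    merged = LUSTRE / f"run{run}" / f"run{run}_ls{ls:04d}_stream{stream}.dat"
    merged.parent.mkdir(parents=True, exist_ok=True)
    with open(merged, "wb") as out:
        for small_file in sorted(input_dir.glob(pattern)):
            with open(small_file, "rb") as src:
                shutil.copyfileobj(src, out)   # append the FU file to the merged output
    return merged
```

In the real system this merging happens concurrently on the many BUs, all writing to the shared Lustre buffer, which is precisely the aspect that changes in the new flow described next.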

The PPC team has therefore spent this summer developing a completely new data flow that operates in parallel with the existing one. In this new approach, the individual small files are still created on the FUs. However, instead of being distributed over the roughly 60 BUs, all files for a given luminosity section and stream are sent to the RAM disk of a single machine. There, the individual files are merged into a file located on a dedicated disk of that same machine. Finally, the merged file is transferred to Tier-0 at CERN.
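
The key idea can be illustrated with a toy routing rule: every FU applies the same deterministic function to decide which machine receives the files of a given luminosity section and stream. The hash-based assignment, the machine count constant, and the stream name below are illustrative assumptions; the actual routing scheme is not described here:

```python
import zlib

N_MERGER_MACHINES = 8   # number of merger machines mentioned below

def merger_for(lumisection: int, stream: str) -> int:
    """Deterministically pick the single machine responsible for (lumisection, stream)."""
    key = f"ls{lumisection}:stream{stream}".encode()
    return zlib.crc32(key) % N_MERGER_MACHINES

# Every FU applies the same rule, so all small files belonging to one luminosity
# section and stream end up on the RAM disk of the same machine, where they are
# merged onto a local disk and then transferred to Tier-0 as a single file.
print(merger_for(1234, "PhysicsUPC"))   # hypothetical lumisection and stream name
```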

While the main operations may look similar between the two data flows, there are major differences. First, the merging of the files is not performed by 60 machines (i.e., the BUs) writing in parallel to a given location on Lustre; instead, a single machine performs the operation without any parallelism. Second, the merged output file is no longer located on Lustre but on a local drive of the individual machine. Third, the new data flow allows for almost no delay in the chain: the arrival of the files from the FUs, the merging, and the transfer must all happen flawlessly, since the buffer on each individual machine is relatively small, a few TB, i.e., less than an hour of data-taking.
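
A quick calculation, assuming roughly 7 TB of usable buffer per machine and the roughly 2 GB/s per-machine rate quoted below, shows why delays cannot be tolerated:

```python
buffer_tb = 7.0        # assumed usable local buffer per machine (TB)
fill_rate_gb_s = 2.0   # per-machine data rate quoted below (GB/s)

minutes_to_full = buffer_tb * 1000.0 / fill_rate_gb_s / 60.0
print(f"Local buffer full after ~{minutes_to_full:.0f} minutes if transfers stall")
# -> about 58 minutes, i.e. less than an hour of data-taking.
```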

To meet these requirements, dedicated Solid State Drives (SSDs) were purchased so that the merging and transfer operations can happen as fast as possible. In particular, each machine hosts two SSDs of about 7 TB each. Each of the eight machines can run the new data flow at a rate of about 2 GB/s, so the total data rate should increase by about 15-16 GB/s. Another important aspect is that, given the small buffer size, the new data flow cannot wait for Tier-0 to acknowledge the arrival of the files. Instead, as soon as the files are successfully transferred on our side, they are deleted. Since both data flows are used at the same time, we had to implement a framework that works whether only one of them or both are in use.
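
The delete-on-successful-transfer policy can be sketched as follows; the transfer command and function name are placeholders, the point being only that no Tier-0 acknowledgment is awaited before freeing the local SSD space:

```python
import subprocess
from pathlib import Path

def transfer_and_delete(merged_file: Path, destination: str) -> bool:
    """Copy a merged file towards Tier-0 and delete it once the copy succeeds locally."""
    # 'xrdcp' is used here only as a stand-in for the actual transfer tool.
    result = subprocess.run(["xrdcp", str(merged_file), destination])
    if result.returncode == 0:      # success as judged on our side only
        merged_file.unlink()        # free the SSD buffer immediately
        return True
    return False                    # keep the file and let the caller retry
```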

The new data flow is planned to be used extensively during the heavy-ion data taking in November (4-26). Tests carried out during standard proton-proton collision data taking show that total rates of about 30 GB/s can be reached.
