Score-based generative models are a new class of generative models that have been shown to accurately generate high-dimensional calorimeter datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex calorimeter showers. Point clouds, however, are likely a more natural representation of calorimeter showers, particularly in calorimeters with high granularity. Point clouds preserve all of the information of the original simulation, deal more naturally with sparse datasets, and can be implemented with more compact models and data files. In this work, two state-of-the-art score-based models are trained on the same set of calorimeter simulations and directly compared.
Detector simulations are essential tools for data analysis because they connect particle and nuclear physics predictions to measurable quantities. However, the most precise detector simulations (usually based on GEANT [1]) are computationally expensive. This is especially true for calorimeters, which are designed to stop most particles and thus require modeling interactions across multiple energy scales. If there were a way to build a fast simulation automatically, using the full detector dimensionality, then data analysis at existing and developing experiments could be greatly enhanced.
Deep learning (DL) has been used to build automated and high-dimensional fast simulations ('surrogate models') for calorimeters. Starting from Generative Adversarial Networks (GANs) and now including Diffusion Models [2], these methods have rapidly improved. However, nearly all proposed methods for DL-based calorimeter simulations are based on an image format (a fixed grid of pixels) [3]. These data are unlike natural images in a number of ways, most notably in their sparsity.
Since most cells in a high-granularity calorimeter image are empty, a more natural representation of these data may be a point cloud [4]. A point cloud is a set of attributes assigned to locations in space; in the calorimeter case, the attribute is the energy and the location is the cell coordinate. A calorimeter point cloud would require far fewer numbers to specify than an image representation, since only cells with non-zero energy would be recorded.
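To make the difference between the two representations concrete, the following is a minimal sketch of converting a voxel image into a point cloud and back. It is illustrative only: the function names, the `[z][y][x]` indexing, the `(x, y, z, energy)` tuple layout, and the toy 4x4x4 grid are assumptions made for this example, not the data format used in this work.

```python
def image_to_point_cloud(image, threshold=0.0):
    """Keep only cells whose energy exceeds `threshold`.

    `image` is a dense voxel grid stored as nested lists indexed [z][y][x]
    (an assumed layout). Returns a list of (x, y, z, energy) tuples.
    """
    points = []
    for z, layer in enumerate(image):
        for y, row in enumerate(layer):
            for x, energy in enumerate(row):
                if energy > threshold:
                    points.append((x, y, z, energy))
    return points


def point_cloud_to_image(points, shape):
    """Fill a dense grid of the given (nz, ny, nx) shape from a list of hits."""
    nz, ny, nx = shape
    image = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for x, y, z, energy in points:
        image[z][y][x] += energy
    return image


# Toy shower (hypothetical values): three hit cells in a 4x4x4 grid.
shape = (4, 4, 4)
hits = [(1, 2, 0, 0.5), (1, 2, 1, 1.25), (2, 2, 1, 0.25)]
image = point_cloud_to_image(hits, shape)
cloud = image_to_point_cloud(image)

n_image_values = shape[0] * shape[1] * shape[2]  # one energy per cell
n_cloud_values = 4 * len(cloud)                  # x, y, z, energy per hit
print(n_image_values, n_cloud_values)
```

In this toy case the image stores 64 numbers while the point cloud stores 12. The saving grows as the grid gets finer while the number of hit cells stays small, which is the situation described above for high-granularity calorimeters.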
For a fair comparison, two diffusion models (one image-based, one point-cloud-based) are trained using the same score-matching strategy on representations of the same parent GEANT simulation. We simulate a high-granularity iron-scintillator calorimeter similar to the forward hadronic calorimeter planned for the ePIC detector at the Electron-Ion Collider.
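For readers unfamiliar with score matching, the sketch below shows a generic denoising score-matching loss on one-dimensional toy data with a single fixed noise level. It is not the training setup of this work: the noise schedule, loss weighting, and network architectures are not specified in this section, and the names `perturb`, `dsm_loss`, `exact_score`, and `zero_score` are hypothetical. A real model would replace `score_fn` with a neural network and sample many noise levels.

```python
import random


def perturb(x0, sigma, rng):
    """Add Gaussian noise of scale sigma; return the noisy value and the noise draw."""
    eps = rng.gauss(0.0, 1.0)
    return x0 + sigma * eps, eps


def dsm_loss(score_fn, data, sigma, rng):
    """Monte Carlo estimate of a sigma^2-weighted denoising score-matching loss.

    The regression target is the score of the Gaussian perturbation kernel,
    d/dx log q(x_noisy | x0) = -(x_noisy - x0) / sigma^2 = -eps / sigma.
    """
    total = 0.0
    for x0 in data:
        x_noisy, eps = perturb(x0, sigma, rng)
        target = -eps / sigma
        total += sigma ** 2 * (score_fn(x_noisy, sigma) - target) ** 2
    return total / len(data)


def exact_score(x, sigma):
    """Exact score when all clean samples sit at 0, so noisy data are N(0, sigma^2)."""
    return -x / sigma ** 2


def zero_score(x, sigma):
    """A deliberately wrong score estimate, used for comparison."""
    return 0.0


data = [0.0] * 1000  # toy dataset: every clean sample is 0
loss_exact = dsm_loss(exact_score, data, sigma=0.5, rng=random.Random(0))
loss_zero = dsm_loss(zero_score, data, sigma=0.5, rng=random.Random(0))
print(loss_exact, loss_zero)
```

The exact score drives the loss to zero on this toy dataset, while the wrong estimate leaves a loss close to one. Because the objective depends only on the clean data and the added noise, the same loss can in principle be applied to either an image or a point-cloud representation of the same showers.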
Both models perform well for most distributions and show very promising classifier performance (AUCs) [5], deviating no more than 10% from the baseline at smaller deposited energies. However, the point cloud model offers several distinct advantages over the image model:
As calorimeters continue to increase in granularity, the advantages of point clouds, combined with further model optimizations, will likely make point-cloud-based models a clear choice for future detectors [8].