Characterising double-Higgs production has been a major part of the LHC physics program in Run 2 and beyond. We discuss new techniques and results in boosted, hadronic final states in CMS, with a focus on wide-radius jet taggers and data-driven multi-jet background estimation. We also present measurements of gluon-gluon-fusion and vector-boson-fusion HH production in the four-beauty-quark final state in 138 fb^-1 of data at √s = 13 TeV, which set an observed (expected) upper limit on the HH production cross section of 9.9 (5.1) times the SM prediction and excluded the quartic VVHH coupling κ2V = 0 for the first time. Finally, we look ahead to possible new final states and improvements to triggers and techniques in Run 3.
With the increase in luminosity and detector granularity, simulation will be a significant computational challenge at the HL-LHC. To tackle this, I present developments in graph- and attention-based machine learning models for generating jets at the LHC using sparse and efficient point cloud representations of our data, which offer a three-orders-of-magnitude improvement in latency compared to full (Geant4) simulation. I also present studies on metrics for validating ML-based simulations, including the novel Fréchet and kernel physics distances, which are found to be highly sensitive to typical mismodelling by ML generative models, and perspectives for future work in this area.
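As a rough illustration of the idea behind a Fréchet-type distance, the sketch below fits a Gaussian to each of two sets of per-jet features and computes the Fréchet distance between the two fits. It is a generic textbook form under stated assumptions, not the metric as defined in the work above: the function name `frechet_gaussian_distance` and the choice of input features are hypothetical, and any finite-sample corrections used in practice are omitted.

```python
import numpy as np


def frechet_gaussian_distance(real, gen):
    """Frechet distance between Gaussians fitted to two feature arrays.

    real, gen: arrays of shape (n_samples, n_features), one row per jet.
    The feature choice is left to the caller (an assumption of this sketch).
    """
    mu_r, mu_g = real.mean(axis=0), gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen, rowvar=False))

    # tr((cov_r cov_g)^(1/2)) from the eigenvalues of the product, which are
    # real and non-negative for positive semi-definite inputs up to round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_r - mu_g) ** 2)
    cov_term = np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt
    return float(mean_term + cov_term)


# Toy usage: a generated sample with shifted means has a larger distance.
rng = np.random.default_rng(0)
real_features = rng.normal(size=(5000, 3))
shifted_features = real_features + 1.0
d_same = frechet_gaussian_distance(real_features, real_features)
d_shift = frechet_gaussian_distance(real_features, shifted_features)
```

In this toy case the covariances are identical, so the distance reduces to the squared difference of the means.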
We present developments in a search for boosted (pT > 250 GeV) Higgs boson pair production, where one Higgs boson decays to a pair of b quarks and the other to two vector bosons, in the all-hadronic final state. Using data collected by the CMS experiment in 2016-2018, corresponding to 137 inverse femtobarns, we show an expected upper limit on HH production using a cut-based analysis and a newly developed H(WW) graph neural network tagger. Such an analysis can provide insight into the trilinear Higgs self-coupling as well as the vector-boson-Higgs couplings.
In high energy physics (HEP), jets are collections of correlated particles produced ubiquitously in particle collisions such as those at the CERN Large Hadron Collider (LHC). Machine learning (ML)-based generative models, such as generative adversarial networks (GANs), have the potential to significantly accelerate LHC jet simulations. However, despite jets having a natural representation as a set of particles in momentum-space, a.k.a. a particle cloud, there exist no generative models applied to such a dataset. In this work, we introduce a new particle cloud dataset (JetNet), and apply to it existing point cloud GANs. Results are evaluated using (1) 1-Wasserstein distances between high- and low-level feature distributions, (2) a newly developed Fréchet ParticleNet Distance, and (3) the coverage and (4) minimum matching distance metrics. Existing GANs are found to be inadequate for physics applications, hence we develop a new message passing GAN (MPGAN), which outperforms existing point cloud GANs on virtually every metric and shows promise for use in HEP. We propose JetNet as a novel point-cloud-style dataset for the ML community to experiment with, and set MPGAN as a benchmark to improve upon for future generative models. Additionally, to facilitate research and improve accessibility and reproducibility in this area, we release the open-source JetNet Python package with interfaces for particle cloud datasets, implementations for evaluation and loss metrics, and more tools for ML in HEP development.
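The first metric listed above can be sketched for a single high-level feature, the jet mass. The example below is a minimal illustration under assumptions that are not taken from the abstract: each jet is an array of massless particles with features ordered as (eta, phi, pT), zero-pT rows are treated as padding, and both samples contain the same number of jets. The function names `jet_mass` and `wasserstein_1d` are hypothetical and are not part of the JetNet package.

```python
import numpy as np


def jet_mass(jets):
    """Invariant mass of each jet, treating particles as massless.

    jets: array of shape (n_jets, n_particles, 3) with features
    (eta, phi, pt); zero-pt rows contribute nothing and act as padding.
    """
    eta, phi, pt = jets[..., 0], jets[..., 1], jets[..., 2]
    px = (pt * np.cos(phi)).sum(axis=1)
    py = (pt * np.sin(phi)).sum(axis=1)
    pz = (pt * np.sinh(eta)).sum(axis=1)
    energy = (pt * np.cosh(eta)).sum(axis=1)
    m2 = energy**2 - px**2 - py**2 - pz**2
    return np.sqrt(np.clip(m2, 0.0, None))


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size 1D samples."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size empirical distributions, W1 is the mean absolute
    # difference between the sorted values.
    return float(np.mean(np.abs(x - y)))


# Toy usage: one jet made of two back-to-back particles with pt = 1.
toy_jet = np.array([[[0.0, 0.0, 1.0], [0.0, np.pi, 1.0], [0.0, 0.0, 0.0]]])
toy_mass = jet_mass(toy_jet)
toy_w1 = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Applying `wasserstein_1d` to the mass arrays of a real and a generated sample gives one number per feature; lower values mean closer distributions.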
There has been significant development recently in generative models for accelerating LHC simulations. Work on simulating jets has primarily used image-based representations, which tend to be sparse and of limited resolution. We advocate for the more natural ‘particle cloud’ representation of jets, i.e. as a set of particles in momentum space, and discuss four physics- and computer-vision-inspired metrics: (1) the 1-Wasserstein distance between high- and low-level feature distributions; (2) a new Fréchet ParticleNet Distance; (3) the coverage; and (4) the minimum matching distance as means of quantitatively and holistically evaluating generated particle clouds. We then present our new message-passing generative adversarial network (MPGAN), which has excellent performance on gluon, top quark, and lighter quark jets on all metrics, validated against real samples via bootstrapping as well as existing point cloud generative models, and shows promise for use in high energy physics.
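The coverage and minimum matching distance can be sketched from a matrix of pairwise distances between real and generated jets. The example below uses the definitions commonly adopted for point clouds, which are assumed here rather than quoted from the abstract: coverage is the fraction of real samples that are the nearest neighbour of at least one generated sample, and the minimum matching distance is the mean distance from each real sample to its closest generated sample. The function name `coverage_and_mmd` is hypothetical, and the distance between two jets is left to the caller.

```python
import numpy as np


def coverage_and_mmd(dist):
    """Coverage and minimum matching distance from pairwise distances.

    dist: array of shape (n_real, n_gen), where dist[i, j] is the distance
    between real sample i and generated sample j.
    """
    dist = np.asarray(dist, dtype=float)
    # Coverage: match each generated sample to its closest real sample and
    # count how many distinct real samples were matched.
    nearest_real = dist.argmin(axis=0)
    coverage = np.unique(nearest_real).size / dist.shape[0]
    # MMD: match each real sample to its closest generated sample and
    # average the matched distances.
    mmd = dist.min(axis=1).mean()
    return float(coverage), float(mmd)


# Toy usage: three real samples, two generated samples.
toy_dist = np.array([[0.0, 5.0],
                     [5.0, 0.0],
                     [4.0, 1.0]])
toy_cov, toy_mmd = coverage_and_mmd(toy_dist)
```

Higher coverage suggests the generated sample spans more of the real distribution, while lower minimum matching distance suggests individual real jets have close generated counterparts.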
In high energy physics (HEP), jets are collections of correlated particles produced ubiquitously in particle collisions such as those at the CERN Large Hadron Collider (LHC). Machine-learning-based generative models, such as generative adversarial networks (GANs), have the potential to significantly accelerate LHC jet simulations. However, despite jets having a natural representation as a set of particles in momentum-space, a.k.a. a particle cloud, to our knowledge there exist no generative models applied to such a dataset. We introduce a new particle cloud dataset (JetNet), and, due to similarities between particle and point clouds, apply to it existing point cloud GANs. Results are evaluated using (1) the 1-Wasserstein distance between high- and low-level feature distributions, (2) a newly developed Fréchet ParticleNet Distance, and (3) the coverage and (4) minimum matching distance metrics. Existing GANs are found to be inadequate for physics applications, hence we develop a new message passing GAN (MPGAN), which outperforms existing point cloud GANs on virtually every metric and shows promise for use in HEP. We propose JetNet as a novel point-cloud-style dataset for the machine learning community to experiment with, and set MPGAN as a benchmark to improve upon for future generative models.
Graph-based networks, with their ability to handle sparse, permutation-invariant data with complex geometries, have recently proven useful in a variety of disciplines. One of these is high energy physics, where they have been successfully applied to important classification and reconstruction tasks but have yet to be explored much for generation. We discuss some generative models for simulating datasets like those produced at the CERN Large Hadron Collider (LHC), and focus on a new message-passing, graph-based generative adversarial network. This approach is demonstrated by training on and generating sparse representations of MNIST images and jets of particles in proton-proton collisions like those at the LHC.
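A single message-passing update on a fully connected graph can be sketched as follows. This is a generic illustration of the technique with random, untrained weights, not the architecture of the network described above: the names `init_params` and `message_passing_step`, the layer sizes, and the use of the pairwise feature distance as an edge input are all assumptions of the sketch.

```python
import numpy as np


def init_params(rng, n_feat, n_hidden):
    """Random weights for one edge network and one node network."""
    return {
        "w_edge": rng.normal(scale=0.1, size=(2 * n_feat + 1, n_hidden)),
        "w_node": rng.normal(scale=0.1, size=(n_feat + n_hidden, n_feat)),
    }


def message_passing_step(h, params):
    """One update of node features h with shape (n_nodes, n_feat)."""
    n = h.shape[0]
    h_recv = np.repeat(h[:, None, :], n, axis=1)  # receiver i, shape (n, n, f)
    h_send = np.repeat(h[None, :, :], n, axis=0)  # sender j, shape (n, n, f)
    dist = np.linalg.norm(h_recv - h_send, axis=-1, keepdims=True)

    # Edge network: one message per ordered pair (i, j), self-pairs included.
    edge_in = np.concatenate([h_recv, h_send, dist], axis=-1)
    messages = np.maximum(edge_in @ params["w_edge"], 0.0)  # ReLU

    # Sum over senders; the sum makes the result independent of node order.
    aggregated = messages.sum(axis=1)

    # Node network: update each node from its own features and its messages.
    node_in = np.concatenate([h, aggregated], axis=-1)
    return np.tanh(node_in @ params["w_node"])


# Toy usage: 5 nodes with 3 features each, e.g. particles in a jet.
rng = np.random.default_rng(1)
params = init_params(rng, n_feat=3, n_hidden=8)
nodes = rng.normal(size=(5, 3))
updated = message_passing_step(nodes, params)
perm = np.array([3, 0, 4, 1, 2])
updated_perm = message_passing_step(nodes[perm], params)
```

Because the aggregation is a sum over neighbours, permuting the input nodes permutes the output in the same way, which is the property that makes this kind of layer suited to unordered sets of particles.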