Fast simulations that can accurately model jet substructure will be of utmost importance for boosted-jet analyses at the HL-LHC. Generative models for accelerating LHC simulations have seen significant recent development, but methods for validating these simulations remain less explored. We present a rigorous study of evaluation metrics and discuss the novel Fréchet and kernel physics distances as highly sensitive, quantitative metrics for validating not only ML-based but potentially also traditional GEANT-based simulations. Finally, as a case study for the use of these metrics, we introduce our graph-network and novel attention-based generative models, which show excellent qualitative and quantitative performance in generating LHC jets.