What you really need to understand is that "Kemper did it first" is only true of the core philosophy: making a "clone" of a real amplifier.
Kemper does not do machine learning. It does not use neural networks, or any kind of data-driven training process where a model learns the behaviour of the circuit from a dataset.
What Kemper actually does is a system identification process using test signals. It probes the amp with a series of stimuli and derives a DSP model that approximates the nonlinear response. It’s clever engineering, but it’s fundamentally a fixed, deterministic measure-and-derive algorithm, not a trained model.
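To make the "probe and derive" idea concrete, here is a toy sketch in Python. To be clear, this is not Kemper's actual profiling algorithm, which is proprietary. Everything in it is a made-up stand-in: `toy_amp` is a hypothetical soft clipper playing the role of the real amp, and the "model" is just a lookup table built from measured levels. The point is only that nothing gets trained: you measure, then you derive.

```python
import math

def toy_amp(x):
    # Hypothetical stand-in for the real amplifier: a simple soft clipper.
    return math.tanh(3.0 * x)

def probe_levels(device, levels):
    # Send each test level through the device and record (input, output).
    return [(level, device(level)) for level in levels]

def build_waveshaper(measurements):
    # Derive a static waveshaper that linearly interpolates the measurements.
    points = sorted(measurements)

    def shaper(x):
        # Clamp inputs that fall outside the probed range.
        if x <= points[0][0]:
            return points[0][1]
        if x >= points[-1][0]:
            return points[-1][1]
        # Find the pair of measured points that brackets x and interpolate.
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)

    return shaper

# 33 evenly spaced test levels from -1.0 to 1.0 (an arbitrary choice).
levels = [i / 16 for i in range(-16, 17)]
model = build_waveshaper(probe_levels(toy_amp, levels))

# Compare the derived model against the "amp" on one cycle of a sine wave.
test_signal = [0.9 * math.sin(2 * math.pi * n / 64) for n in range(64)]
max_error = max(abs(model(x) - toy_amp(x)) for x in test_signal)
print("max error on test signal:", max_error)
```

Run it twice and you get exactly the same model, because there is no random initialisation and no optimisation loop. A real amp also has frequency-dependent behaviour that a static table like this cannot capture.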
Modern systems like Neural DSP's Quad Cortex captures, ToneX, NAM, etc. are doing something quite different. They train neural networks on large datasets of input/output pairs and let the model learn the transfer function. Neural DSP even uses its own robot (TINA) to help create the datasets: it captures training data from a reference amp with the knobs at a range of values, and that data is used to create a parametric model rather than a single snapshot model. That's how they create their amp models.
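For contrast, here is a toy sketch of the data-driven approach, again in Python. It is not how Neural DSP, ToneX or NAM are actually implemented. The "amp" (`reference_amp`), the single gain knob, and the two-weight "network" are all hypothetical, chosen so the example stays tiny. What it does show is the workflow: collect input/output pairs at several knob settings, then let gradient descent find weights that reproduce them, including the dependence on the knob.

```python
import math

def reference_amp(x, knob):
    # Hypothetical "real amp": drive increases with the gain knob (0..1).
    return math.tanh((1.0 + 2.0 * knob) * x)

def make_dataset():
    # Input/output pairs captured at several knob positions.
    knobs = [0.0, 0.25, 0.5, 0.75, 1.0]
    inputs = [i / 10 for i in range(-10, 11)]
    return [(x, k, reference_amp(x, k)) for k in knobs for x in inputs]

def predict(w, x, knob):
    # Tiny stand-in for a neural network: one tanh unit whose drive
    # depends on the knob through two trainable weights.
    return math.tanh((w[0] + w[1] * knob) * x)

def mse(w, data):
    # Mean squared error between model output and captured output.
    return sum((predict(w, x, k) - y) ** 2 for x, k, y in data) / len(data)

def train(data, steps=5000, lr=0.5):
    w = [0.5, 0.0]  # arbitrary starting weights
    n = len(data)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, k, y in data:
            y_hat = predict(w, x, k)
            # Chain rule through the squared error and the tanh.
            common = 2.0 * (y_hat - y) * (1.0 - y_hat * y_hat) * x
            grad0 += common
            grad1 += common * k
        # Full-batch gradient descent step.
        w[0] -= lr * grad0 / n
        w[1] -= lr * grad1 / n
    return w

data = make_dataset()
initial_loss = mse([0.5, 0.0], data)
weights = train(data)
final_loss = mse(weights, data)
print("loss before:", initial_loss, "after:", final_loss, "weights:", weights)
```

Nobody told the model what the knob does. It worked that out from the data, which is the essential difference from the probe-and-derive approach. Real systems use far larger networks that also take past samples into account, but the training loop has the same shape.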
So yes, Kemper absolutely popularised the idea of cloning an amp. But the underlying technology used by later systems is not just an iteration of Kemper’s method. In many cases it’s an entirely different class of modelling technique.