The double descent phenomenon is what happens after interpolation.
--
RESPONDING TO YOUR LAST COMMENT (after reaching thread depth limit):
Think of it this way: why and how does a model's performance on previously unseen samples continue to improve after it has fit every training sample exactly (that is, after it interpolates the training set)? Interpolation is not the end point of training but a threshold, beyond which models can go on to generalize better than they did at the interpolation point. So how is it that these models improve on interpolation?
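To make the question concrete, here is a minimal sketch of the model-size version of the effect. It is my own toy setup, not anything from the thread: linear regression on Gaussian features, where the model only sees the first `p` of `n_total` features and is fit with the minimum-norm least-squares solution. The function name `double_descent_demo` and all parameter values are hypothetical. It only shows the shape of the curve (train error reaches zero at `p = n_train`, test error peaks there and then falls as `p` keeps growing); it does not explain why.

```python
import numpy as np

def double_descent_demo(n_train=20, n_total=200, widths=(5, 20, 200),
                        noise=0.5, trials=20, n_test=500, seed=0):
    """Return {p: (mean train MSE, mean test MSE)} for min-norm least squares."""
    rng = np.random.default_rng(seed)
    results = {p: [] for p in widths}
    for _ in range(trials):
        # Assumed ground truth: weights spread over all n_total features.
        beta = rng.standard_normal(n_total) / np.sqrt(n_total)
        X_tr = rng.standard_normal((n_train, n_total))
        X_te = rng.standard_normal((n_test, n_total))
        y_tr = X_tr @ beta + noise * rng.standard_normal(n_train)
        y_te = X_te @ beta + noise * rng.standard_normal(n_test)
        for p in widths:
            # The model only sees the first p features. pinv gives the
            # minimum-norm least-squares fit, which interpolates the
            # training set once p >= n_train.
            w = np.linalg.pinv(X_tr[:, :p]) @ y_tr
            train_mse = np.mean((X_tr[:, :p] @ w - y_tr) ** 2)
            test_mse = np.mean((X_te[:, :p] @ w - y_te) ** 2)
            results[p].append((train_mse, test_mse))
    # Average the (train, test) pairs over all trials.
    return {p: tuple(np.mean(v, axis=0)) for p, v in results.items()}

if __name__ == "__main__":
    for p, (train_mse, test_mse) in double_descent_demo().items():
        print(f"p={p:4d}  train MSE={train_mse:.3e}  test MSE={test_mse:.3e}")
```

With `p = 5` the model underfits, at `p = 20` it just barely interpolates the 20 training points, and at `p = 200` it interpolates with plenty of room to spare. Comparing the test error of the last two is the "improving on interpolation" I am asking about.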