Interpolation == extreme overfitting.

The double descent phenomenon is what happens after interpolation.

RESPONDING TO YOUR LAST COMMENT (after reaching thread depth limit):

Think of it this way: why and how does the model's performance on previously unseen samples continue to improve after the model has fully overfit (interpolated) all training samples? Interpolation is not the endpoint of training but a threshold, after which models learn to generalize better than they did at the point of interpolation. How is it that these models improve on interpolation?
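If it helps to see this concretely, here is a minimal sketch of the effect in a toy setting. It is not a deep network: it assumes a random ReLU feature model with a least-squares readout, and it sweeps model width rather than training time. Every name and setting here (`double_descent_curve`, the widths, the noise level, the sample sizes) is my own choice for illustration. The point is that the training error reaches roughly zero once the width matches the number of training samples, while the test error typically spikes near that threshold and then falls again as the model gets wider.

```python
import numpy as np


def random_relu_features(X, W):
    """Map inputs to fixed random ReLU features; only the readout is trained."""
    return np.maximum(X @ W, 0.0)


def double_descent_curve(widths, n_train=40, n_test=1000, d=5,
                         noise=0.5, n_trials=20, seed=0):
    """Return mean train/test MSE per width, averaged over random trials.

    Toy assumptions: linear ground truth, Gaussian inputs, noisy training labels.
    """
    rng = np.random.default_rng(seed)
    train_mse = {p: 0.0 for p in widths}
    test_mse = {p: 0.0 for p in widths}
    for _ in range(n_trials):
        w_true = rng.normal(size=d)
        X_tr = rng.normal(size=(n_train, d))
        X_te = rng.normal(size=(n_test, d))
        y_tr = X_tr @ w_true + noise * rng.normal(size=n_train)
        y_te = X_te @ w_true  # noiseless test targets
        # One random projection per trial; narrower models use a prefix of it.
        W = rng.normal(size=(d, max(widths))) / np.sqrt(d)
        for p in widths:
            Phi_tr = random_relu_features(X_tr, W[:, :p])
            Phi_te = random_relu_features(X_te, W[:, :p])
            # For p > n_train, lstsq returns the minimum-norm solution
            # among all readouts that fit the training data.
            beta, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)
            train_mse[p] += np.mean((Phi_tr @ beta - y_tr) ** 2) / n_trials
            test_mse[p] += np.mean((Phi_te @ beta - y_te) ** 2) / n_trials
    return train_mse, test_mse


if __name__ == "__main__":
    # n_train = 40, so width 40 is the interpolation threshold in this sketch.
    widths = [5, 10, 20, 40, 80, 400, 1600]
    train_mse, test_mse = double_descent_curve(widths)
    for p in widths:
        print(f"width={p:5d}  train_mse={train_mse[p]:.4f}  "
              f"test_mse={test_mse[p]:.4f}")
```

In this sketch the minimum-norm readout is doing the work past the threshold: among all the solutions that interpolate the training set, the wider models pick a smoother one. That is one commonly offered explanation for the second descent, not a complete answer to the question above.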