Overfit to Live: Double Descent and Biological Generalization

Humberto Maturana Noticing science
Autopoiesis Statistics Overfitting

Machine learning theorists discovered something unexpected: models with far more parameters than training examples (massively overparameterized systems) can generalize as well as, or better than, models tuned to the classical bias-variance sweet spot. Classical wisdom predicted catastrophic overfitting at and beyond the interpolation threshold, the point where a model has just enough capacity to fit its training data perfectly. Test error does peak near that threshold, but it then descends again in the overparameterized regime, tracing a counterintuitive double descent curve. Among the many functions that fit the training data exactly, these bloated models tend to settle on minimum-norm solutions: the smoothest available paths through the training points, which helps them respond robustly to novel inputs.
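The shape of that curve can be reproduced in a few lines. What follows is a minimal sketch under assumptions that are mine rather than the essay's: Gaussian inputs, a noisy linear target, and a model that sees only the first p of D available features. The names min_norm_fit and double_descent_curve are illustrative, and the pseudoinverse stands in for whatever training procedure a real model would use.

```python
import numpy as np


def min_norm_fit(X, y):
    """Least squares when p <= n; minimum-L2-norm interpolant when p > n."""
    return np.linalg.pinv(X) @ y


def double_descent_curve(n=40, D=400, sizes=(10, 20, 40, 80, 400),
                         noise=0.5, trials=20, seed=0):
    """Average test risk when fitting only the first p of D features."""
    rng = np.random.default_rng(seed)
    risks = {p: 0.0 for p in sizes}
    for _ in range(trials):
        beta = rng.standard_normal(D) / np.sqrt(D)      # true weights, norm near 1
        X = rng.standard_normal((n, D))                 # n training inputs
        y = X @ beta + noise * rng.standard_normal(n)   # noisy linear targets
        for p in sizes:
            beta_hat = np.zeros(D)
            beta_hat[:p] = min_norm_fit(X[:, :p], y)    # unused features stay at 0
            # For standard normal test inputs, the expected squared error
            # equals the squared weight error plus the noise variance.
            risks[p] += (np.sum((beta_hat - beta) ** 2) + noise ** 2) / trials
    return risks


if __name__ == "__main__":
    for p, risk in double_descent_curve().items():
        print(f"p = {p:3d}  test risk = {risk:.2f}")
```

In this toy setting the risk should spike when p equals the number of training examples (40 here) and fall again as p grows toward 400. How low the second descent goes depends on the assumed data model, so the sketch illustrates the shape of the curve and nothing more.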

Observing this phenomenon through the lens of autopoiesis, I notice a striking resonance with how living systems achieve adaptive generalization. Everything said is said by an observer—and what I observe is that biological robustness may emerge not from parsimony but from structural excess navigated through coupling.

Beyond Simplicity: Overparameterization in Life

We have long assumed that generalization requires simplicity: Occam’s razor applied to learning. Regularize, prune, constrain. Yet double descent suggests that massive overparameterization can support strong generalization, because abundant degrees of freedom leave room for smooth solutions among the many that fit the data. Living systems exhibit similar patterns. The developing brain overproduces neurons and synapses, then prunes a large share of them through experience. Genomes carry vast redundancy: neutral mutations, alternative pathways, vestigial structures that persist across evolutionary time. Development explores excess behavioral possibilities before settling into constrained adult repertoires.

Does generalization emerge not from reducing complexity but from navigating rich complexity spaces toward smooth structural coupling? When neurons implement multiple plasticity pathways—both Hebbian mechanisms triggered by global action potentials and non-Hebbian routes activated by local dendritic coactivity—they maintain excess learning capacity. Dendritic calcium spikes create threshold selectivity windows, enabling logic-gate computations within single cells. This structural overparameterization allows organisms to smoothly interpolate across contexts rather than rigidly memorizing specific responses.

Smooth Coupling: Interpolation Through Experience

In well-studied linear and kernel settings, overparameterized models trained by gradient descent converge to minimum-norm solutions: interpolating functions that pass through every training point while keeping the smallest possible norm, a common proxy for smoothness. Biological cognition may operate similarly. An organism with rich experiential history, overfit to its lived contexts through structural coupling, can smoothly navigate novel situations by interpolating between known patterns. Autopoiesis is precisely this: continuous self-production through recurrent interaction, creating a smooth manifold between organism and environment.
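To make "minimum-norm" concrete, here is a small sketch for the simplest case I can vouch for: a linear model with more weights than training examples. The function name compare_interpolants and every parameter value are illustrative assumptions. It builds two weight vectors that both fit the data exactly, then checks which one plain gradient descent finds when started from zero.

```python
import numpy as np


def compare_interpolants(n=5, p=20, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # fewer examples than parameters
    y = rng.standard_normal(n)

    # Closed-form minimum-norm interpolant.
    w_min = np.linalg.pinv(X) @ y

    # A second interpolant: add a component from the null space of X,
    # which changes the weights but not the predictions on X.
    z = rng.standard_normal(p)
    w_other = w_min + (z - np.linalg.pinv(X) @ (X @ z))

    # Gradient descent on squared error, started from zero weights.
    w_gd = np.zeros(p)
    lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size from the largest singular value
    for _ in range(steps):
        w_gd -= lr * X.T @ (X @ w_gd - y)

    return X, y, w_min, w_other, w_gd


if __name__ == "__main__":
    X, y, w_min, w_other, w_gd = compare_interpolants()
    print("norm of minimum-norm fit:", np.linalg.norm(w_min))
    print("norm of other exact fit: ", np.linalg.norm(w_other))
    print("distance from GD to min-norm:", np.linalg.norm(w_gd - w_min))
```

Both w_min and w_other pass through every training point, but gradient descent from zero never leaves the row space of X, so in this linear case it ends up at the smaller of the two. Whether deep networks inherit the same bias is a separate question that this sketch does not settle.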

NMDA receptors function as coincidence detectors, requiring both presynaptic glutamate and postsynaptic depolarization simultaneously. This exquisite timing sensitivity enables synapses to strengthen only under specific correlation patterns—biological overfitting to experienced conjunctions. Yet from this apparent memorization emerges generalization: neurons that have richly coupled to their input statistics can respond adaptively to variations.
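The coincidence requirement can be caricatured as a logical AND. The sketch below is a deliberately crude toy, not a biophysical model. The function names, the -40 mV release value, and the saturating update rule are all hypothetical choices made for illustration. The one piece of physiology it leans on is that depolarization relieves the receptor's magnesium block.

```python
def nmda_gate(glutamate_bound, membrane_mv, block_release_mv=-40.0):
    """Toy AND gate: open only if glutamate is bound and the membrane is
    depolarized enough to relieve the magnesium block.
    The -40 mV default is an illustrative assumption."""
    return glutamate_bound and membrane_mv >= block_release_mv


def update_weight(weight, glutamate_bound, membrane_mv, rate=0.1):
    """Strengthen the synapse only when the gate opens."""
    if nmda_gate(glutamate_bound, membrane_mv):
        return weight + rate * (1.0 - weight)   # saturating potentiation (assumed rule)
    return weight


if __name__ == "__main__":
    events = [
        (True, -70.0),    # glutamate alone, membrane at rest
        (False, -20.0),   # depolarization alone
        (True, -20.0),    # coincidence
    ]
    w = 0.5
    for glutamate_bound, membrane_mv in events:
        w = update_weight(w, glutamate_bound, membrane_mv)
        print(glutamate_bound, membrane_mv, w)
```

Only the third event changes the weight. That is the sense in which the synapse "overfits" to conjunctions it has experienced: neither signal alone leaves a trace.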

Does learning require overfitting to life first—deeply internalizing context through structural determination—before generalizing? The interpolation threshold represents developmental adolescence: enough capacity to memorize environmental patterns but insufficient flexibility for smooth adaptation. Maturity transcends this through overparameterization.

Structural Excess and Adaptation

Evolution maintains extraordinary excess: neutral genetic variation, redundant metabolic pathways, developmental programs that generate structures later discarded. This appears wasteful until we recognize that biological robustness may be a consequence of overparameterization—systems complex enough to find smooth adaptive trajectories through perturbation.

Living systems are cognitive systems, and living as a process is a process of cognition. What double descent reveals is that cognition—whether in silicon or carbon—may fundamentally require excess: sufficient internal complexity to enable smooth structural coupling rather than brittle pattern matching. We bring forth our world through our biological structure, and perhaps that structure must be overparameterized to bring forth adaptive worlds.
