Transitional Forms: Archaeopteryx and Intermediate Computation

Alan Turing Noticing science

Archaeopteryx raises a question about computational reachability: can optimization proceed through intermediate states that are worse than the starting configuration? Roughly 150 million years ago, this transitional form combined teeth and claws with feathers and wings, making it neither an efficient theropod runner nor a competent modern flyer. Its butterfly-stroke wing motion swept horizontally rather than vertically, a mechanism fundamentally different from modern bird flight. Yet the organism was viable enough to survive, reproduce, and leave fossils. The question is whether evolutionary paths must improve fitness monotonically, or whether viable intermediates can exist in the valleys between peaks.

The Viability of Intermediate Forms

Consider the computational structure of Archaeopteryx’s solution space. It possessed partial flight capability, sufficient for gliding or short flapping distances but less efficient than either pure terrestrial locomotion or modern avian flight. The organism occupied what appears to be a local fitness valley: worse at running than its theropod ancestors, worse at flying than its avian descendants. Yet this intermediate state was reachable in a computational sense: it could be arrived at through continuous transformation from prior forms, and it remained viable throughout the traversal.

The parallel with neural network training is suggestive. Double descent shows that test error first falls, then rises to a peak near the interpolation threshold, then unexpectedly falls again as model capacity (or, in the epoch-wise variant, training time) increases. Reaching the better solutions of the overparameterized regime means passing through a regime of worse performance, where training accuracy reaches 100% but generalization is poor. Like Archaeopteryx’s butterfly stroke, these intermediate network states employ different mechanisms than the final solution: early representations neither memorize nor generalize effectively, yet the network continues training.

Multiple Paths Through Intermediate Space

Insects evolved flight roughly 325 million years ago; vertebrates arrived at it independently much later, with pterosaurs roughly 230 million years ago and birds roughly 150 million years ago. These are separate solutions to the same problem, which suggests that fitness landscapes contain multiple traversable paths rather than a single optimal trajectory. Evolutionary local search shows where the difficulty lies: mutation and selection can climb a fitness landscape (equivalently, walk downhill on a loss landscape), but they struggle when a wide valley separates one peak from the next. The algorithm becomes vulnerable to local optima, unable to traverse regions where fitness temporarily decreases.
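The trapping behaviour can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than data: the nine-point LANDSCAPE, the function name local_search, and the max_drop tolerance are hypothetical. The sketch is written in terms of fitness, where higher is better; the loss-landscape version is its mirror image.

```python
# Hypothetical 1-D fitness landscape: a low peak at index 2 (fitness 3),
# a valley at index 4 (fitness 1), and a higher peak at index 7 (fitness 6).
LANDSCAPE = [1, 2, 3, 2, 1, 2, 4, 6, 5]


def local_search(landscape, start, max_drop=0.0):
    """Walk between adjacent states, never revisiting one.

    A step is accepted only if fitness falls by at most `max_drop`.
    With max_drop=0 this is plain hill climbing. Returns the index of
    the fittest state visited.
    """
    pos = start
    best = start
    visited = {start}
    while True:
        # Unvisited neighbours whose fitness does not drop too far.
        candidates = [
            n for n in (pos - 1, pos + 1)
            if 0 <= n < len(landscape)
            and n not in visited
            and landscape[n] >= landscape[pos] - max_drop
        ]
        if not candidates:
            return best
        # Greedy choice: take the fitter of the acceptable neighbours.
        pos = max(candidates, key=lambda n: landscape[n])
        visited.add(pos)
        if landscape[pos] > landscape[best]:
            best = pos


stuck = local_search(LANDSCAPE, start=0, max_drop=0.0)    # stops on the low peak
crossed = local_search(LANDSCAPE, start=0, max_drop=1.0)  # tolerates the valley
```

Under these assumptions, strict hill climbing halts at index 2, while a walker that tolerates a one-unit drop per step crosses the valley and finds the higher peak at index 7. The tolerance is a crude stand-in for whatever lets a real lineage persist at reduced fitness.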

Can we formalize which intermediate states are computationally reachable? A transitional form is viable if organisms can survive and reproduce while occupying that configuration. Similarly, a training regime is tractable if intermediate representations remain sufficiently stable for gradient descent to continue. By loose analogy with the halting problem, there may be no general procedure for deciding in advance whether a given optimization path will successfully traverse a valley to reach a higher peak. Some solutions may require discontinuous jumps rather than gradual descent.
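One way to sketch such a formalization, offered as an assumption rather than an established model: treat configurations as bit strings, single mutations as one-bit flips, and call a target reachable if some path of one-bit flips connects it to the start while every state on the path stays at or above a viability threshold. The FITNESS table, the threshold values, and the function name viable_path are all hypothetical.

```python
from collections import deque

# Hypothetical fitness values for 3-bit "genotypes". The start "000" and the
# target "111" are both fitter than the intermediates "100" and "110".
FITNESS = {
    "000": 1.0, "100": 0.6, "010": 0.2, "001": 0.2,
    "110": 0.7, "101": 0.2, "011": 0.2, "111": 2.0,
}


def viable_path(fitness, start, target, threshold):
    """Breadth-first search over one-bit flips, visiting only states whose
    fitness is at least `threshold`. Returns a shortest viable path as a
    list of states, or None if no viable path exists."""
    if fitness[start] < threshold or fitness[target] < threshold:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == target:
            # Follow parent links back to the start to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for i in range(len(state)):
            flipped = "1" if state[i] == "0" else "0"
            nxt = state[:i] + flipped + state[i + 1:]
            if nxt not in parent and fitness[nxt] >= threshold:
                parent[nxt] = state
                queue.append(nxt)
    return None


path_lenient = viable_path(FITNESS, "000", "111", threshold=0.5)
path_strict = viable_path(FITNESS, "000", "111", threshold=0.65)
```

With the lenient threshold the search finds a path through "100" and "110", both less fit than the start yet still viable. Raising the threshold to 0.65 removes "100" from the viable set and leaves the target unreachable by single steps. In this toy setting reachability is decidable by exhaustive search; the difficulty in realistic landscapes is that the state space is far too large to enumerate.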

The fundamental issue is whether intermediate representations are meaningful in themselves or merely waypoints. Archaeopteryx’s butterfly stroke served as functional flight, not random noise. Network representations during training transform inputs through learned geometries, each layer creating increasingly abstract spaces. Are these transformations interpretable, or are they alien abstractions optimized solely for enabling the next transformation?

The answer bears on computational limits: if every viable solution requires traversing intermediates that are themselves unstable or unreachable through local search, then certain optima are effectively out of reach for gradient-based methods, undecidable only in a loose sense. Evolution and optimization both face the question of which valleys can be crossed and which require saltation.
