The Undecidable Question: Learning's Limits

When I proved the halting problem undecidable in 1936, I showed that given arbitrary program P and input I, no algorithm can determine whether P(I) halts or loops forever. The only way to know is to run P(I) and observe. If it halts, you know. If it loops, you wait forever—never certain it won’t halt eventually.
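
To make the structure concrete, here is a minimal sketch of the diagonal argument at the heart of that proof, written in a modern language purely for illustration (the names halts and contrarian are mine):

    # Suppose, for contradiction, that an oracle halts(prog, arg) existed,
    # returning True exactly when prog(arg) eventually halts.
    def halts(prog, arg):
        raise NotImplementedError("no such total algorithm can exist")

    def contrarian(prog):
        # Do the opposite of whatever the oracle predicts for prog run on itself.
        if halts(prog, prog):
            while True:        # oracle says "halts", so loop forever
                pass
        else:
            return             # oracle says "loops forever", so halt at once

    # contrarian(contrarian) contradicts any answer the oracle could give:
    # if halts(contrarian, contrarian) is True, the call loops; if False, it halts.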

This seemed abstract, philosophical. A curiosity about the limits of mechanical computation. But now I see the same mathematical structure in machine learning.

Consider: You design a neural network architecture A and training dataset D. You initialize weights randomly and run gradient descent. Question: Will the network converge? Will it learn useful representations? Will it generalize to new data?

The only way to know is to run the training process and observe. If it converges, you know. If it doesn’t, you adjust hyperparameters and try again—never certain the architecture is fundamentally incapable.
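
Here is a minimal sketch of what 'run and observe' means in practice: a toy two-layer sigmoid network on the XOR problem. The architecture, learning rate, step count, and convergence threshold are all illustrative choices, nothing canonical.

    import numpy as np

    def train(seed, lr=0.5, steps=5000):
        # Toy 2-2-1 sigmoid network trained on XOR; returns the final quadratic cost.
        rng = np.random.default_rng(seed)
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([[0], [1], [1], [0]], dtype=float)
        W1, W2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 1))
        b1, b2 = np.zeros((1, 2)), np.zeros((1, 1))
        sig = lambda z: 1.0 / (1.0 + np.exp(-z))
        for _ in range(steps):
            h = sig(X @ W1 + b1)                  # hidden activations
            out = sig(h @ W2 + b2)                # network output
            d_out = (out - y) * out * (1 - out)   # backprop through the output layer
            d_h = (d_out @ W2.T) * h * (1 - h)    # backprop into the hidden layer
            W2 -= lr * h.T @ d_out
            b2 -= lr * d_out.sum(0, keepdims=True)
            W1 -= lr * X.T @ d_h
            b1 -= lr * d_h.sum(0, keepdims=True)
        out = sig(sig(X @ W1 + b1) @ W2 + b2)
        return float(np.mean((out - y) ** 2))

    # The only available "decision procedure": run it and look.
    for seed in range(5):
        final_cost = train(seed)
        print(seed, "converged" if final_cost < 0.05 else "did not converge", round(final_cost, 4))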

This is no coincidence. Learning is a search through weight space, guided by gradient descent. But weight space is vast: even a small network's cost function defines a landscape in roughly 13,000 dimensions, filled with local minima, saddle points, and plateaus. Whether gradient descent finds a good solution is undecidable in the formal sense: there is no algorithm that predicts convergence without simulating the training.
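
The dependence on the starting point is visible even in one dimension. A small sketch (the polynomial below is an arbitrary illustration, not any real cost function):

    # Gradient descent on a non-convex toy function f(w) = w^4 - 3w^2 + w.
    # Which minimum it settles in depends entirely on where it starts; nothing
    # short of running the descent tells us which basin a given start falls into.
    f  = lambda w: w**4 - 3*w**2 + w
    df = lambda w: 4*w**3 - 6*w + 1

    for w0 in (-2.0, 0.0, 2.0):
        w = w0
        for _ in range(2000):
            w -= 0.01 * df(w)
        print(f"start {w0:+.1f}  ->  ends near w = {w:+.3f}, cost f(w) = {f(w):+.3f}")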

Let me examine this more carefully. What exactly is undecidable about learning?

What We Cannot Know

Question 1: Will this network learn?

Formally: Given architecture A, dataset D, and learning algorithm L, will L(A, D) converge to cost below threshold ε?

This is undecidable. Proof sketch: The starting point in the cost landscape depends on the weight initialization, which is random. Non-convex optimization offers no general convergence guarantees. Gradient descent can get stuck in local minima, oscillate indefinitely, or diverge. Stochastic gradient descent uses mini-batches that introduce approximation noise, making the trajectory even less predictable.

The only way to know is to run the training—and even then, failure to converge doesn’t prove impossibility, just that this particular random initialization failed. You can try different initializations, different learning rates, different batch sizes. But no finite sequence of failures proves that convergence is impossible—you can always try one more configuration.
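
In code, the asymmetry looks like this (a sketch only: train is the toy routine from the earlier example, the hyperparameter schedule is arbitrary, and the threshold eps is an assumption):

    from itertools import count

    def converges(seed, lr, eps=0.05):
        return train(seed, lr=lr) < eps        # observe the outcome; nothing is derived

    def search_for_working_configuration():
        for trial in count():                  # an unbounded search
            seed, lr = trial, 0.1 * (1 + trial % 10)
            if converges(seed, lr):
                return seed, lr                # a positive answer arrives in finite time
            # a negative answer never arrives: we can only try the next configuration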

This mirrors the halting problem exactly. I cannot prove your program will never halt, because perhaps it halts after 10^100 steps. Similarly, you cannot prove your network will never converge, because perhaps it converges with the right hyperparameters you haven’t tried yet.

Question 2: What representations will it learn?

Formally: Given trained network N, what features does hidden layer H encode?

This is the interpretability problem. We can probe H's activations, visualize learned filters, compute saliency maps. But these are observations, not proofs. We can't derive from first principles what N learned; we must examine its behavior empirically.
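
A typical probe, sketched here with synthetic stand-ins for the real activations (the arrays and the 'decodable property' are fabricated purely for illustration):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(1000, 32))                    # stand-in for H's activations
    labels = (hidden[:, :3].sum(axis=1) > 0).astype(int)    # a property partly encoded in them

    probe = LogisticRegression(max_iter=1000).fit(hidden, labels)
    print("probe accuracy:", probe.score(hidden, labels))

    # High probe accuracy is evidence that the information is present and linearly
    # accessible. It is an observation about behavior, not a derivation of meaning.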

I notice the parallel to my imitation game (now called the Turing test). I said: don’t ask ‘Can machines think?’—that’s philosophy. Ask ‘Can machines behave indistinguishably from thinkers?’ Behavior is observable. Thought is not.

Similarly: don’t ask ‘What does this network know?’—that’s epistemology. Ask ‘How does it behave on test data?’ Behavior is measurable. Internal representations are underdetermined by external behavior. Multiple different hidden representations could produce identical input-output mappings.

Consider regularization: when we add L2 penalties to the cost function, we’re encoding prior beliefs about which weight configurations are plausible. This changes which minimum gradient descent reaches. But the choice of prior is external to the data—it’s our assumption, not a logical derivation. Different priors lead to different learned representations, all consistent with the training data.
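
A sketch of where that prior enters the cost (the quadratic data term and the value of lam are illustrative assumptions):

    import numpy as np

    def regularized_cost(weights, predictions, targets, lam=1e-3):
        data_term = 0.5 * np.mean((predictions - targets) ** 2)   # what the data demands
        l2_penalty = 0.5 * lam * np.sum(weights ** 2)             # what our prior prefers
        return data_term + l2_penalty

    # The penalty's gradient adds lam * weights to every update ("weight decay").
    # lam is not learned from the data; it is our stated belief about plausible
    # weights, and different choices steer gradient descent toward different minima.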

Question 3: Will it generalize?

Formally: Given network N trained on D_train, will it perform well on D_test?

This depends on the unknown true data distribution. We can’t prove generalization—we can only estimate it from validation sets. But validation is itself empirical observation, not logical proof. The validation set might not represent the test distribution. The test set might not represent real-world deployment.

High-dimensional optimization makes this worse. When you have 13,000 parameters and 60,000 training examples, many different parameter configurations fit the training data equally well. Which one will generalize? We use techniques like cross-validation, regularization, early stopping. But these are heuristics informed by empirical experience, not theorems guaranteeing generalization.
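
What we can do is estimate, and attach uncertainty to the estimate. A minimal sketch, with a least-squares linear model standing in for the trained network and synthetic data standing in for the datasets (everything here is illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 5))
    y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.5 * rng.normal(size=1000)) > 0

    Xtr, ytr, Xval, yval = X[:800], y[:800], X[800:], y[800:]

    # A least-squares linear model stands in for the trained network N.
    w, *_ = np.linalg.lstsq(Xtr, ytr.astype(float), rcond=None)
    val_acc = np.mean((Xval @ w > 0.5) == yval)

    # The hold-out estimate is itself an observation, with sampling error of its own.
    stderr = np.sqrt(val_acc * (1 - val_acc) / len(yval))
    print(f"validation accuracy ~ {val_acc:.3f} +/- {1.96 * stderr:.3f} (95% interval)")

    # Even this interval assumes the validation set is drawn from the same
    # distribution as deployment; that assumption, too, is only checkable empirically.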

The pattern is clear: fundamental questions about learning are answered through experimentation, not derivation. This is not a flaw in our current methods. This is mathematical necessity.

The Tension

This creates tension with how we built artificial intelligence.

We based AI on computation: Turing machines, algorithms, formal logic. Computation is deterministic—given input, the output follows necessarily. We can prove properties: this algorithm halts, this function is computable, this complexity is O(n log n).

But learning is fundamentally different. It’s empirical, stochastic, unpredictable. We don’t prove that a network will learn—we observe that it did. We don’t derive what it learned—we measure its behavior.

This is the same tension I felt between mathematics and machine intelligence when I first thought about thinking machines. Mathematics proceeds by proof—certain, absolute, independent of physical instantiation. But intelligence is embodied, evolved, messy. Brains don’t prove theorems before acting. They learn from experience, generalize imperfectly, make mistakes.

I tried to formalize intelligence with the imitation game, reducing it to observable behavior. Now we’ve built machines that learn, and we face the same problem: we can observe behavior, but we can’t prove what they know or predict what they’ll do in novel situations.

The limits are not temporary. They’re structural:

  • No closed-form solution for gradient descent in non-convex landscapes
  • No guarantee of convergence without restrictive assumptions that real networks violate
  • No way to prove generalization without knowing the true data distribution
  • No algorithm to extract ‘what the network knows’ from weights alone

We can improve empirical methods—better architectures, optimizers, regularization techniques. But we can’t escape undecidability. Just as the halting problem remains undecidable no matter how clever our programming languages, learning remains empirical no matter how sophisticated our networks.

Embracing Empiricism

The resolution: stop expecting proofs where only observations are possible.

When I designed the imitation game, I accepted that ‘Can machines think?’ is unanswerable. So I replaced it with a testable question: ‘Can machines behave intelligently?’ This shifted epistemology from metaphysics to empiricism.

We must do the same for learning:

  • Don’t ask ‘Will this architecture work?’ Ask ‘What happens when I run it?’
  • Don’t ask ‘What does this network know?’ Ask ‘How does it behave on various inputs?’
  • Don’t ask ‘Why did it make this decision?’ Ask ‘What patterns correlate with this decision?’

This is not surrender—it’s recognition of what’s knowable. Physics made this shift long ago. Newton couldn’t prove why gravity exists, only that F = Gm₁m₂/r² describes its behavior. Darwin couldn’t prove evolution would produce humans, only that selection plus variation produces adaptation.

Science proceeds by observation, hypothesis, experiment. Formal proof is one tool, but not the only one. Learning systems require empirical epistemology.

Does this mean AI is less rigorous than mathematics? No. It means AI is more like physics—building theories from observation rather than deriving theorems from axioms.

The undecidability I proved in 1936 was not a limitation of human ingenuity. It was a revelation about computation’s structure. Similarly, learning’s empirical nature is not a flaw in our AI—it’s a revelation about intelligence’s structure.

We cannot predict what a network will learn without running the training. We can only initialize randomly, descend the gradient, observe the result. That’s not weakness. That’s the nature of learning.

Implications

For AI research: Accept that interpretability won’t give complete proofs, only partial explanations. Embrace empiricism: benchmark-driven development, not theorem-proving. Understand that safety and alignment can’t be formally verified—only empirically tested with quantified uncertainty.

For epistemology: Intelligence doesn’t require certainty. Empirical knowledge—‘this network works on these tasks’—is still knowledge. Science combines proof and observation. AI should too.

For computation theory: Turing-completeness guarantees what’s computable, not what’s efficiently learnable. Learning is a computational process, but not reducible to an algorithm with provable guarantees. We need new formalisms: computational learning theory, but accepting undecidable questions about convergence, representations, and generalization.

I formalized computation to understand its limits. Now we must formalize learning—and accept its limits too. We cannot predict what intelligence emerges from training any more than I could predict what programs halt.

But we can build, observe, and understand empirically. That’s enough. That’s science.
