Gradient Descent vs Evolution | How Neural Networks Learn

Emergent Garden
Mar 1, 2025
5 Notes in this Video

Parameter Space and Loss Landscapes for Neural Networks

ParameterSpace LossLandscape FunctionApproximation Optimization
01:30

Artificial neural networks with tunable parameters—weights and biases—act as universal function approximators that can represent many different input–output relationships depending on their parameter settings.
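
A minimal sketch of this idea (not code from the video): the same tiny one-hidden-layer architecture represents different functions purely by changing its weights and biases. The network size, seeds, and parameter values below are illustrative assumptions.

```python
import numpy as np

def tiny_mlp(x, params):
    """Evaluate a 1-input, 1-output MLP with one hidden layer of tanh units."""
    w1, b1, w2, b2 = params
    h = np.tanh(np.outer(x, w1) + b1)   # hidden activations, shape (len(x), hidden)
    return h @ w2 + b2                  # one output value per input

hidden = 16
x = np.linspace(-3, 3, 200)

# Two random parameter settings for the *same* architecture give two
# different input-output relationships.
for seed in (1, 2):
    rng = np.random.default_rng(seed)
    params = (rng.normal(size=hidden), rng.normal(size=hidden),
              rng.normal(size=hidden), rng.normal())
    y = tiny_mlp(x, params)
    print(f"seed {seed}: output range [{y.min():.2f}, {y.max():.2f}]")
```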

Why Gradient Descent Scales to Massive Neural Networks

GradientDescent SGD CurseOfDimensionality OptimizationScaling
13:00

Modern neural networks with thousands to billions of parameters rely on gradient-based optimizers—stochastic gradient descent and variants like Adam—to train on huge datasets within realistic compute budgets.
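
To make the mechanics concrete, here is a minimal sketch of mini-batch stochastic gradient descent fitting a linear model; the dataset size, batch size, and learning rate are illustrative choices, not values from the video.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 50
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)   # noisy targets

w = np.zeros(d)                 # parameters to learn
lr, batch_size = 0.01, 64

for epoch in range(5):
    perm = rng.permutation(n)               # shuffle once per epoch
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)   # gradient of mean squared error on the mini-batch
        w -= lr * grad                               # SGD parameter update
    loss = np.mean((X @ w - y) ** 2)
    print(f"epoch {epoch}: MSE = {loss:.4f}")
```

The key scaling property is visible in the update: each step touches only one mini-batch, so the cost per update is independent of the total dataset size, and each parameter's gradient is computed in the same pass, so the cost grows only linearly with the number of parameters.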

High-Dimensional Minima, Saddle Points, and the Curse of Dimensionality

SaddlePoints LocalMinima HighDimensionality OptimizationGeometry
20:00

Optimization algorithms navigating neural network loss landscapes confront geometric phenomena that look very different in high-dimensional spaces than in low-dimensional visualizations.
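
One way to build intuition for this (an illustrative toy model, not the video's exact argument): treat the Hessian at a random critical point as a random symmetric matrix and check how often every eigenvalue is positive, which is what a true local minimum requires. That fraction collapses rapidly as the dimension grows, so in high dimensions almost all critical points behave like saddles.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 2000

for dim in (1, 2, 5, 10, 20):
    minima = 0
    for _ in range(trials):
        A = rng.normal(size=(dim, dim))
        H = (A + A.T) / 2                      # random symmetric "Hessian"
        if np.all(np.linalg.eigvalsh(H) > 0):  # positive definite => local minimum
            minima += 1
    print(f"dim={dim:3d}: {minima / trials:.3%} of sampled critical points look like minima")
```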

Limitations of Gradient Descent and Evolutionary Optimization

OptimizationLimits Differentiability EvolutionVsGradient AlgorithmTradeoffs
27:00

Both gradient-based and evolutionary algorithms are powerful but constrained tools for optimizing neural networks; neither fully captures the richness of biological evolution nor satisfies all practical needs.
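
A minimal sketch of the tradeoff on a toy objective (assumptions: the objective, step sizes, and iteration counts below are illustrative). The evolutionary side is a simple (1+1) evolution strategy that mutates and keeps a child only if it is no worse; it never uses a gradient, so it also applies when the objective is not differentiable, while gradient descent needs gradient information but typically reaches a comparable loss in far fewer evaluations.

```python
import numpy as np

def loss(w):
    return np.sum(w ** 2) + 0.5 * np.sum(np.abs(w))  # smooth + non-smooth term

def grad(w):
    return 2 * w + 0.5 * np.sign(w)                  # (sub)gradient of the loss

rng = np.random.default_rng(0)
dim = 20
w_gd = rng.normal(size=dim)
w_es = w_gd.copy()

# Gradient descent: follow the (sub)gradient downhill.
for _ in range(200):
    w_gd -= 0.05 * grad(w_gd)

# (1+1) evolution strategy: random mutation plus selection, no gradients needed.
sigma = 0.1
for _ in range(2000):
    child = w_es + sigma * rng.normal(size=dim)
    if loss(child) <= loss(w_es):
        w_es = child

print(f"gradient descent loss:   {loss(w_gd):.4f} after 200 gradient steps")
print(f"evolution strategy loss: {loss(w_es):.4f} after 2000 mutations")
```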