Invariant Translations: Translational Symmetry and Weight Sharing

My Gaussian curvature taught me this: intrinsic properties survive displacement. Move a curved surface anywhere through space, and its curvature remains unchanged—geometry independent of position. This is translational invariance at its purest.

Noether revealed the deep structure: when physical laws remain unchanged under spatial translation, momentum must be conserved. Not as coincidence, but as mathematical necessity. Translational symmetry—the equivalence of all spatial positions—forces momentum conservation. Time symmetry likewise demands energy conservation. The pattern is elegant: continuous symmetry operations produce conserved quantities. Symmetry constrains what can change.
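
The mechanism fits in a few lines. A one-dimensional sketch, assuming a single coordinate q with Lagrangian L(q, q̇); the notation is mine, not the source's:

```latex
% Translation symmetry forces momentum conservation: one-dimensional sketch.
% Invariance of L under q -> q + \epsilon means L has no explicit q-dependence:
\frac{\partial L}{\partial q} = 0
% Substituting into the Euler--Lagrange equation of motion,
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = 0,
% so the canonical momentum p = \partial L / \partial \dot{q} is the conserved
% quantity. The same substitution with time translation yields energy conservation.
```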

The Architectural Manifestation

Now observe convolutional networks exploiting this same principle through architecture rather than physics. Weight sharing assumes edges, textures, and patterns appear at different image positions with identical meaning. A vertical edge detector at pixel (x, y) uses the same weights as the detector at (x + Δx, y + Δy). The filter translates across the input, and its response translates with it: the shared-weight operation is equivariant to translation, and invariance follows when responses are pooled across positions.
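
The claim is concrete enough to verify. A minimal NumPy sketch; the image size, the patch, and the 3×3 filter are illustrative assumptions, not from the source:

```python
import numpy as np

def correlate_valid(image, kernel):
    """Apply one shared kernel at every position ('valid' cross-correlation)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# One vertical-edge detector; the same weights serve every (x, y).
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

image = np.zeros((10, 10))
image[3:6, 2:5] = 1.0                  # a small patch of "feature"

shifted = np.zeros_like(image)
shifted[3:6, 5:8] = 1.0                # the identical patch, moved 3 pixels right

a = correlate_valid(image, kernel)
b = correlate_valid(shifted, kernel)

# Equivariance: translating the input translates the response map.
assert np.allclose(b[:, 3:], a[:, :-3])
```

Translating the patch three pixels translates the detector's response three pixels; nothing about the detection itself changes.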

This is not merely computational efficiency—though economy delights me. It is an architectural assertion about the world’s structure: visual features possess translational symmetry. The cat’s ear means “cat’s ear” whether positioned top-left or bottom-right. By building this symmetry assumption into the network topology, we reduce the hypothesis space dramatically. Instead of learning separate detectors for each position, we learn one detector applied everywhere. Reuse through translation.
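
The economy is easy to quantify. A back-of-envelope count; the 224×224 RGB input, 64 detectors, and 3×3 filters are my illustrative assumptions:

```python
# Position-specific detectors vs. one shared detector, illustrative sizes.
H, W, C_in, C_out, k = 224, 224, 3, 64, 3

# A separate detector per output position: a dense map from image to 64 maps.
dense_params = (H * W * C_in) * (H * W * C_out)      # ~4.8e11 weights

# One shared detector applied everywhere: a single 3x3 convolution.
conv_params = C_out * C_in * k * k + C_out           # 1,792 weights (incl. biases)

print(f"dense: {dense_params:,}   conv: {conv_params:,}")
```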

The hierarchical feature learning compounds this. Early layers detect simple edges invariant to position. Middle layers combine these into textures—still translationally symmetric. Deep layers construct object parts and whole objects. Each layer’s composable transformations operate identically across spatial positions, creating economy through symmetry.
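
A minimal PyTorch sketch of such a stack; the layer widths are arbitrary choices for illustration, not a prescribed architecture:

```python
import torch
import torch.nn as nn

# Each Conv2d applies its shared weights identically at every spatial
# position; stacking composes edges -> textures -> object parts.
hierarchy = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # edges
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # textures
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # parts
)

x = torch.randn(1, 3, 64, 64)
print(hierarchy(x).shape)   # torch.Size([1, 64, 16, 16])
```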

Biological Coordinates and Conservation

Grid cells in entorhinal cortex offer a biological parallel. Hexagonal firing patterns tile space through periodic translation—the same computational motif repeated at different positions. Multiple grid scales create a hierarchical coordinate system where the pattern, not the position, carries information. This is translational symmetry in neural representation: the grid structure remains invariant even as the animal moves through space.
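
A standard idealization of that motif sums three plane waves oriented 60 degrees apart, producing a hexagonally periodic rate map. The sketch below is this textbook model with invented parameters, not a fit to recorded cells:

```python
import numpy as np

def grid_cell_rate(x, y, scale=1.0, phase=(0.0, 0.0)):
    """Idealized grid cell: three cosine plane waves 60 degrees apart
    sum to a hexagonally periodic firing pattern that tiles the plane."""
    rate = np.zeros_like(x, dtype=float)
    for theta in (0.0, np.pi / 3, 2 * np.pi / 3):
        kx, ky = np.cos(theta), np.sin(theta)
        rate += np.cos(2 * np.pi / scale * (kx * (x - phase[0]) + ky * (y - phase[1])))
    return rate

xs, ys = np.meshgrid(np.linspace(0, 4, 200), np.linspace(0, 4, 200))

coarse = grid_cell_rate(xs, ys, scale=1.0)                  # one module
fine = grid_cell_rate(xs, ys, scale=0.5)                    # a finer module
moved = grid_cell_rate(xs, ys, scale=1.0, phase=(0.5, 0.25))

# Moving through space shifts the pattern's phase; the hexagonal
# structure itself (periods, orientations) is unchanged.
```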

Yet here’s the provocation: Noether tells us every continuous symmetry produces a conservation law. Translational symmetry conserves momentum. What does architectural weight sharing conserve? Perhaps “feature identity”—the meaning of a learned pattern remains constant across positions. Or computational resources—the same parameters serve arbitrarily many spatial locations. Or gradient information during learning—updates from all positions accumulate into shared weights.
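
The last claim can be checked directly. A small PyTorch demonstration, with toy one-dimensional sizes, that the gradient on a shared filter is the sum of per-position contributions:

```python
import torch

# A single shared 3-tap filter slid over a length-16 signal.
x = torch.randn(1, 1, 16)
w = torch.randn(1, 1, 3, requires_grad=True)

y = torch.nn.functional.conv1d(x, w)   # shape (1, 1, 14)
y.sum().backward()

# Every position the filter visited "votes" into the same w.grad:
# dL/dw[k] = sum over positions t of x[t + k].
manual = torch.stack([x[0, 0, t:t + 3] for t in range(y.shape[-1])]).sum(0)
assert torch.allclose(w.grad[0, 0], manual)
```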

Precision and Its Limits

But precision demands acknowledging where symmetry breaks down. Real-world images are not perfectly translationally symmetric. Background differs from foreground. Context matters. Absolute position can be informative—sky typically appears above ground, not below. Perfect weight sharing may assume too much symmetry, forcing the network to learn position-dependent adjustments in deeper layers or through attention mechanisms.
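
One deliberate way to relax the symmetry is to append coordinate channels to the input, so later filters can condition on absolute position (the idea behind CoordConv). A sketch, with invented tensor sizes:

```python
import torch

def add_coord_channels(x):
    """Append normalized (row, col) coordinates as extra channels, letting
    subsequent convolutions learn position-dependent behavior."""
    n, _, h, w = x.shape
    rows = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(n, 1, h, w)
    cols = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(n, 1, h, w)
    return torch.cat([x, rows, cols], dim=1)

x = torch.randn(2, 3, 32, 32)
print(add_coord_channels(x).shape)   # torch.Size([2, 5, 32, 32])
```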

Grid cells too exhibit imperfect symmetry—patterns distort near boundaries, scales vary across modules. The mathematical idealization of perfect translational invariance meets the messy reality of constrained spaces and contextual dependencies.

The elegance remains: symmetry operations, whether in physics or architecture, create economy by identifying what stays invariant. Noether proved symmetry necessitates conservation. Weight sharing proves architecture can encode symmetry assumptions, discovering structure precisely where invariance holds. Few principles, but ripe with consequence.
