Probabilistic View of Linear Regression and Least Squares
Linear regression is reframed as a probabilistic generative model in which outputs are generated by a linear function of inputs plus Gaussian noise, rather than as a purely geometric line-fitting exercise.
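As a minimal sketch of this generative view, the snippet below simulates data as y = Xw + ε with ε ~ N(0, σ²); the weights, noise level, and dimensions are illustrative assumptions, not values from the original.

```python
import numpy as np

# Generative view of linear regression: each output is a linear function of
# the inputs plus zero-mean Gaussian noise. All numbers here are illustrative.
rng = np.random.default_rng(0)
n, d = 200, 3
true_w = np.array([2.0, -1.0, 0.5])   # assumed "true" coefficients
sigma = 0.3                           # assumed noise standard deviation

X = rng.normal(size=(n, d))           # inputs (design matrix)
eps = rng.normal(0.0, sigma, size=n)  # Gaussian noise, eps ~ N(0, sigma^2)
y = X @ true_w + eps                  # generative model: y = Xw + eps
```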
Likelihood Maximization, Log-Likelihood, and Optimization in Regression
The same regression problem can be approached geometrically or probabilistically; the probabilistic framing maximizes the likelihood, or equivalently the log-likelihood, as the optimization objective.
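To make the connection to least squares explicit (a standard derivation, not spelled out above): under the noise model y_i = wᵀx_i + ε_i with ε_i ~ N(0, σ²), the log-likelihood of the weights is

$$
\log p(y \mid X, w, \sigma^2) = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - w^\top x_i\right)^2,
$$

so maximizing the log-likelihood over w is the same as minimizing the sum of squared errors: maximum likelihood under Gaussian noise recovers ordinary least squares.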
Regularization as Prior Beliefs About Regression Coefficients
When multiple weight vectors (w) fit the data equally well, our prior beliefs about plausible coefficient values help select among them, leading to regularization terms in the objective.
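In Bayesian terms this is maximum a posteriori (MAP) estimation, sketched here as a standard identity rather than anything specific to the original text: the prior p(w) multiplies the likelihood, and its logarithm becomes an additive penalty,

$$
\hat{w}_{\text{MAP}} = \arg\max_{w}\,\big[\log p(y \mid X, w) + \log p(w)\big]
= \arg\min_{w}\,\Big[\sum_{i=1}^{n}\big(y_i - w^\top x_i\big)^2 \;-\; 2\sigma^{2}\log p(w)\Big],
$$

so each choice of prior p(w) shows up as a particular regularization term added to the least-squares objective.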
Gaussian Priors and L2 (Ridge) Regularization
A common assumption in regression is that each coefficient (w_i) is drawn from a zero-centered Gaussian distribution—most features have small effects, with larger weights increasingly unlikely.
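Taking the negative log of that zero-mean Gaussian prior yields a squared-norm penalty λ‖w‖², i.e. ridge regression. The sketch below uses the closed-form ridge solution; the data, the prior-to-penalty ratio λ, and the comparison at the end are illustrative assumptions, not details from the original.

```python
import numpy as np

# Ridge regression as MAP estimation with a zero-mean Gaussian prior on w.
# lam plays the role of sigma^2 / tau^2, where tau^2 is the prior variance
# on each coefficient; its value here is an illustrative assumption.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y = X @ w_true + rng.normal(0.0, 0.5, 50)

lam = 1.0
d = X.shape[1]
# Closed-form ridge / MAP solution: (X^T X + lam I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# The prior pulls every coefficient toward zero relative to plain least squares.
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_ols))  # True
```

Larger λ, corresponding to a tighter prior around zero, shrinks the coefficients more aggressively.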
Laplace Priors and L1 (Lasso) Regularization for Sparsity
In many domains—genomics, neuroscience, high-dimensional feature spaces—we expect only a handful of features to matter, with most coefficients exactly zero, motivating sparse models.
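A zero-mean Laplace prior has a negative log proportional to the L1 norm ‖w‖₁, so the MAP estimate becomes the lasso. The sketch below uses scikit-learn's Lasso on synthetic data in which only three of twenty features matter; the data and the penalty strength alpha are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Lasso as MAP estimation with a Laplace prior: the L1 penalty sets many
# coefficients exactly to zero. Data and alpha are illustrative assumptions.
rng = np.random.default_rng(0)
n, d = 100, 20
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [4.0, -3.0, 2.0]            # only a handful of features matter
y = X @ w_true + rng.normal(0.0, 0.5, n)

lasso = Lasso(alpha=0.1).fit(X, y)
print(np.sum(lasso.coef_ == 0))          # most of the 20 coefficients are exactly 0
```

Unlike the ridge penalty, the L1 penalty has a kink at zero, which is what lets it set irrelevant coefficients exactly to zero rather than merely shrinking them.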