A method of linear regression that fits a line by minimising the sum of the squares of the errors.

The sum of the square of errors for $n$ data-points is notated as:

$$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

where $e_i = y_i - \hat{y}_i$ is the error of the $i$-th data-point and $\hat{y}_i$ is the value predicted by the regression line.

Bad notation

$\left( \sum_{i=1}^{n} e_i \right)^2$ is not $\sum_{i=1}^{n} e_i^2$. This is just an example of shitty notation, because we should all know that:

$$\sum_{i=1}^{n} e_i^2 = e_1^2 + e_2^2 + \cdots + e_n^2$$

Similarly:

$$\left( \sum_{i=1}^{n} e_i \right)^2 = \left( e_1 + e_2 + \cdots + e_n \right)^2$$
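
To see the difference concretely, here is a minimal NumPy sketch; the error values are made up purely for illustration:

```python
import numpy as np

# Hypothetical errors for five data-points (illustrative values only)
e = np.array([1.0, -2.0, 0.5, 3.0, -1.5])

sum_of_squares = np.sum(e ** 2)   # e_1^2 + e_2^2 + ... + e_n^2
square_of_sum = np.sum(e) ** 2    # (e_1 + e_2 + ... + e_n)^2

print(sum_of_squares)  # 16.5
print(square_of_sum)   # 1.0, a very different number
```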

Using Linear Algebra

Recall that

$$S = \sum_{i=1}^{n} \left( y_i - (a + b x_i) \right)^2$$

can be written as:

$$S = \left\lVert \mathbf{y} - X\boldsymbol{\beta} \right\rVert^2$$

where:

$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad \boldsymbol{\beta} = \begin{pmatrix} a \\ b \end{pmatrix}$$
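
As a sanity check on the matrix form, here is a minimal NumPy sketch; the data-points and the candidate $\boldsymbol{\beta}$ are made up for illustration and are not part of the derivation:

```python
import numpy as np

# Hypothetical data-points (x_i, y_i)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Design matrix: a column of ones (for a) and the x values (for b)
X = np.column_stack([np.ones_like(x), x])

# An arbitrary candidate beta = (a, b)
beta = np.array([1.0, 2.0])

# S = ||y - X beta||^2, the same number as sum((y_i - (a + b*x_i))^2)
S = np.sum((y - X @ beta) ** 2)
print(S)  # about 0.1, up to floating-point error
```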

Then, $S$ is reduced whenever $X\boldsymbol{\beta}$ is close to $\mathbf{y}$. Since we can't actually change the values of $\mathbf{y}$, we will have to adjust $\boldsymbol{\beta}$ (and through it, $a$ and $b$).

We do so by taking the column space of $X$. Let $W$ be the column space of $X$. We assume that all values are real, so $W$ is a subspace of $\mathbb{R}^n$.

We also define $\mathbb{R}^n$ to be an inner product space, with its inner product being the dot product.

Then, our goal is to find a value of $\boldsymbol{\beta}$ such that the distance between $X\boldsymbol{\beta}$ and $\mathbf{y}$, namely $\lVert \mathbf{y} - X\boldsymbol{\beta} \rVert$, is minimised.

We can use the properties of orthogonal projection:

$$\left\lVert \mathbf{y} - \operatorname{proj}_W \mathbf{y} \right\rVert \leq \left\lVert \mathbf{y} - \mathbf{w} \right\rVert \quad \text{for all } \mathbf{w} \in W$$

So the vector closest to $\mathbf{y}$ in $W$ is $\operatorname{proj}_W \mathbf{y}$. Let $X\boldsymbol{\beta} = \operatorname{proj}_W \mathbf{y}$. Another property of orthogonal projection can be used:

$$\left\langle \mathbf{y} - \operatorname{proj}_W \mathbf{y},\, \mathbf{w} \right\rangle = 0 \quad \text{for all } \mathbf{w} \in W$$

Let $\mathbf{w} = X\mathbf{v}$ for an arbitrary $\mathbf{v} \in \mathbb{R}^2$ (i.e. $\mathbf{w} \in W$). Then

$$\left\langle \mathbf{y} - X\boldsymbol{\beta},\, X\mathbf{v} \right\rangle = 0$$

Since the inner product of $\mathbb{R}^n$ is the dot product:

$$\left( X\mathbf{v} \right) \cdot \left( \mathbf{y} - X\boldsymbol{\beta} \right) = 0$$

Now we use a property unique to the dot product, the matrix formula $\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^{\mathsf{T}} \mathbf{v}$:

$$\left( X\mathbf{v} \right)^{\mathsf{T}} \left( \mathbf{y} - X\boldsymbol{\beta} \right) = 0$$
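
A quick numeric check of that identity, with arbitrarily chosen vectors:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# For 1-D arrays, u @ v already computes u^T v as a scalar
print(np.dot(u, v))  # 32.0
print(u @ v)         # 32.0, the same value
```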

Then using properties of matrix transposition:

$$\mathbf{v}^{\mathsf{T}} X^{\mathsf{T}} \left( \mathbf{y} - X\boldsymbol{\beta} \right) = 0 \quad \Longrightarrow \quad \mathbf{v}^{\mathsf{T}} \left( X^{\mathsf{T}}\mathbf{y} - X^{\mathsf{T}}X\boldsymbol{\beta} \right) = 0$$

Remember that $\mathbf{v}$ is any vector in $\mathbb{R}^2$. That means $\mathbf{v}$ can also be $X^{\mathsf{T}}\mathbf{y} - X^{\mathsf{T}}X\boldsymbol{\beta}$. In that case, we can take the positivity of the inner product (property 4, positivity): $\langle \mathbf{u}, \mathbf{u} \rangle \geq 0$, with $\langle \mathbf{u}, \mathbf{u} \rangle = 0$ if and only if $\mathbf{u} = \mathbf{0}$. Here

$$\left( X^{\mathsf{T}}\mathbf{y} - X^{\mathsf{T}}X\boldsymbol{\beta} \right)^{\mathsf{T}} \left( X^{\mathsf{T}}\mathbf{y} - X^{\mathsf{T}}X\boldsymbol{\beta} \right) = 0$$

so

$$X^{\mathsf{T}}X\boldsymbol{\beta} = X^{\mathsf{T}}\mathbf{y}$$

(since we just look at the case where the inner product is zero, which forces $X^{\mathsf{T}}\mathbf{y} - X^{\mathsf{T}}X\boldsymbol{\beta} = \mathbf{0}$)

And, if $X^{\mathsf{T}}X$ is an invertible matrix, then:

$$\boldsymbol{\beta} = \left( X^{\mathsf{T}}X \right)^{-1} X^{\mathsf{T}} \mathbf{y}$$
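
Putting the whole derivation into code: a minimal NumPy sketch (with made-up data) that solves the normal equations $X^{\mathsf{T}}X\boldsymbol{\beta} = X^{\mathsf{T}}\mathbf{y}$ and checks the two facts we used, that the residual $\mathbf{y} - X\boldsymbol{\beta}$ is orthogonal to the column space of $X$, and that the answer agrees with NumPy's own least-squares solver:

```python
import numpy as np

# Hypothetical data-points
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.8, 5.1, 7.2, 8.9])

# Design matrix for the line y-hat = a + b x
X = np.column_stack([np.ones_like(x), x])

# Solve X^T X beta = X^T y directly
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [a, b]

# The residual should be orthogonal to every column of X
residual = y - X @ beta
print(X.T @ residual)  # ~[0, 0], up to floating-point error

# Cross-check against NumPy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta, beta_lstsq))  # True
```

Solving the normal equations with `np.linalg.solve` rather than computing $(X^{\mathsf{T}}X)^{-1}$ explicitly is a standard design choice: it gives the same $\boldsymbol{\beta}$ but is cheaper and more numerically stable.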