Closed-Form Linear Regression Practice Problem
This data science coding problem helps you practice linear regression, its closed-form solution, and implementation skills. Read the problem statement, write your solution, and strengthen your understanding of linear regression.
- Problem ID: 1
- Problem key: 1-closed-form-linear-regression
- URL: https://datacrack.app/solve/1-closed-form-linear-regression
- Difficulty: hard
- Topic: Linear Regression
- Module: Introduction to Machine Learning
Problem Statement
# Closed-Form Linear Regression
---
### 🎯 Goal
You’ve learned how gradient descent updates weights step-by-step to minimize the Mean Squared Error (MSE).
Now you’ll discover a **direct mathematical shortcut** — the **Normal Equation**,
which computes the optimal weights *analytically* without iterations.
This problem helps you **derive** the Normal Equation and **implement** it in code,
including the **intercept (bias)** term that allows the regression line to shift vertically.
---
### 🧠 Step 1 — Recall MSE in Scalar Form
The loss function (Mean Squared Error) for linear regression is:
$$
L = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2
$$
and since $ \hat{y}_i = w_1 x_{i1} + w_2 x_{i2} + \dots + w_d x_{id} + b $, we can write:
$$
L = \frac{1}{n}\sum_{i=1}^{n}(y_i - (x_i^\top w + b))^2
$$
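If you want to see this loss in action, here is a minimal sketch that evaluates the scalar MSE with an explicit loop over samples (the toy numbers below are illustrative only, not part of the problem's test data):
```python
import numpy as np

# Toy data: 3 samples, 2 features (illustrative values only)
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([6.0, 12.0, 18.0])
w = np.array([1.0, 2.0])  # one weight per feature
b = 1.0                   # bias / intercept

n = len(y)
loss = 0.0
for i in range(n):
    y_hat_i = X[i] @ w + b          # w_1 x_{i1} + w_2 x_{i2} + b
    loss += (y[i] - y_hat_i) ** 2   # squared residual
loss /= n                           # average over all samples
print(loss)  # 0.0 here, because this toy w and b fit the toy y exactly
```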
---
### 🧮 Step 2 — Expressing it in Linear Algebra Form
Let’s represent all samples at once:
- $ X $ → the feature matrix of shape $(n, d)$
- $ y $ → the true labels of shape $(n, 1)$
- $ w $ → the weights of shape $(d, 1)$
- $ b $ → the bias (scalar)
Then, the predictions can be written as:
$$
\hat{y} = Xw + b
$$
To simplify the math, we **absorb the bias term** into the weight vector by
adding a column of ones to $ X $:
$$
X' = [\mathbf{1} \quad X]
$$
and define
$$
w' =
\begin{bmatrix}
b \\
w
\end{bmatrix}
$$
Now the prediction becomes:
$$
\hat{y} = X' w'
$$
and the total loss in matrix form is:
$$
L = \frac{1}{n}(y - X'w')^\top (y - X'w')
$$
This is the **vectorized form of MSE**, including the intercept term.
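The same toy computation from Step 1 can be written in this vectorized form with a few lines of NumPy (again, illustrative values only); note how prepending a column of ones turns the bias into just another weight:
```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([6.0, 12.0, 18.0])
w_prime = np.array([1.0, 1.0, 2.0])  # [b, w_1, w_2]

n = len(y)
X_prime = np.column_stack([np.ones(n), X])  # X' = [1  X]
residual = y - X_prime @ w_prime            # y - X'w'
loss = (residual @ residual) / n            # (1/n)(y - X'w')^T (y - X'w')
print(loss)  # matches the loop-based scalar computation (0.0 here)
```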
---
### ✏️ Step 3 — Derive the Optimal Weights
Now derive $ w' $ that minimizes $ L $.
Steps to follow:
1. Expand $ L = (y - X'w')^\top (y - X'w') $ (the constant factor $ \frac{1}{n} $ is dropped here, since it does not change the minimizer)
2. Take the derivative with respect to $ w' $
3. Set the derivative equal to zero
4. Solve for $ w' $
When you do this correctly, you will obtain an expression for $ w' $
that directly gives the optimal bias and weights using only matrix operations.
This is known as the **Normal Equation** —
the closed-form solution for linear regression that computes both the bias and the feature weights in one step.
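If you want to check your own derivation afterwards, one standard route through the algebra looks like this (with the constant $ \frac{1}{n} $ dropped, as noted above):
$$
\begin{aligned}
L &= (y - X'w')^\top (y - X'w') = y^\top y - 2\,w'^\top X'^\top y + w'^\top X'^\top X'\, w' \\
\frac{\partial L}{\partial w'} &= -2\,X'^\top y + 2\,X'^\top X'\, w' = 0 \\
X'^\top X'\, w' &= X'^\top y \\
w' &= \left(X'^\top X'\right)^{-1} X'^\top y
\end{aligned}
$$
The last line assumes $ X'^\top X' $ is invertible; when it is not, the pseudo-inverse gives the minimum-norm solution.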
---
### 💻 Step 4 — Implement the Closed-Form Solution
Now implement a Python function that computes $ w' $ directly using NumPy.
---
### 📥 Input / 📤 Output
**Input:**
- `X`: NumPy array of shape $(n, d)$ — feature matrix
- `y_true`: NumPy array of shape $(n,)$ — target values
**Output:**
- `w_full`: NumPy array of shape $(d + 1,)$ — first element is bias $b$, the rest are weights
---
### 🧩 Starter Code
```python
import numpy as np

def normal_equation(X, y_true):
    """
    Compute the optimal bias and weights for linear regression
    using the Closed-Form (Normal Equation) solution.
    """
    # TODO: Add a column of ones to X for the bias term
    # TODO: Implement the Normal Equation formula
    pass
```
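After attempting it yourself, you can compare against a minimal reference sketch such as the one below. It uses `np.linalg.pinv` (the Moore–Penrose pseudo-inverse) rather than a plain matrix inverse, because $ X'^\top X' $ can be singular, as it is in the example below, where the second feature always equals the first plus one:
```python
import numpy as np

def normal_equation(X, y_true):
    """
    Closed-form (Normal Equation) linear regression.
    Returns [b, w_1, ..., w_d] as a single array.
    """
    n = X.shape[0]
    X_prime = np.column_stack([np.ones(n), X])  # prepend a column of ones for the bias
    # w' = (X'^T X')^+ X'^T y; the pseudo-inverse also covers the rank-deficient case
    return np.linalg.pinv(X_prime.T @ X_prime) @ X_prime.T @ y_true
```
An equivalent option is `np.linalg.lstsq(X_prime, y_true, rcond=None)[0]`, which solves the same least-squares problem without explicitly forming $ X'^\top X' $.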
---
### 💡 Example + Expected Output
```python
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([6, 12, 18])
w_full = normal_equation(X, y)
print(w_full)
```
**Expected Output:**
```
[1. 1. 2.]
```
Here, the first element (1.0) is the bias $b$,
and the remaining values (1.0, 2.0) are the feature weights $w_1, w_2$.
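As a quick check (assuming the sketch above, or your own implementation), rebuilding $ X' $ and multiplying by `w_full` should reproduce the targets:
```python
X_prime = np.column_stack([np.ones(len(X)), X])
print(X_prime @ w_full)  # [ 6. 12. 18.], i.e. the original y
```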
---
### ✅ What You’ll Learn
* How to include the intercept (bias) in matrix-based linear regression
* How to convert scalar MSE into linear algebra form
* How to derive the closed-form solution and implement it using NumPy
* How the Normal Equation provides a one-step, analytical alternative to gradient descent
---