Ridge Regression (L2 Regularization) Practice Problem
This data science coding problem gives you practice with regularization for linear regression, specifically ridge regression (L2 regularization), and with implementing it in code. Read the problem statement, then write your solution.
- Problem ID: 130
- Problem key: 130-ridge-regression-l2-regularization
- URL: https://datacrack.app/solve/130-ridge-regression-l2-regularization
- Difficulty: hard
- Topic: Regularization for Linear Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Ridge Regression (L2 Regularization)
---
### 🎯 Goal
Implement **Ridge Regression** — linear regression with an L2 penalty — using a **closed-form solution** (the regularized Normal Equation).
> Note: Implementing Ridge Regression means building the full linear regression solution with an L2 penalty — from the regularized loss, to the closed-form equation, to computing the best weights.
> Note: A closed-form solution means we take the derivative, set it equal to zero, and solve the equation directly. Since Ridge Regression gives us an equation we can solve, we do not need gradient descent here.
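
The derivative step mentioned in the note can be written out as follows. This is a sketch that assumes the regularized loss is the unnormalized sum of squared errors plus $\lambda \lVert w \rVert_2^2$; the symbol $J$ is introduced here for illustration only, and a different scaling of the loss (for example a $1/N$ factor) would simply rescale $\lambda$.

$$J(w) = \lVert Xw - y \rVert_2^2 + \lambda \lVert w \rVert_2^2$$

$$\nabla_w J(w) = 2X^T(Xw - y) + 2\lambda w = 0$$

$$(X^TX + \lambda I)\,w = X^Ty \quad\Longrightarrow\quad w = (X^TX + \lambda I)^{-1}X^Ty$$
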
---
### 💻 Task
You are given input features $X$, targets $y$, and a regularization strength $\lambda$.
You need to compute the optimal weights using the **Ridge closed-form solution** and return predictions.
Steps:
1. Compute the Ridge weights using: $w = (X^TX + \lambda I)^{-1}X^Ty$
2. Compute predictions: $\hat{y} = Xw$
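3. Return both `weights` and `y_pred`, each rounded to 6 decimal places. Round only at the end: compute $\hat{y}$ from the unrounded weights, which is what the expected output in the example is consistent with.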
---
### 🔍 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
| :----: | :------ | :----------- |
| $X$ | Input feature matrix | $(N, d)$ |
| $y$ | Target values | $(N,)$ |
| $\lambda$ | Regularization strength | float |
| $I$ | Identity matrix | $(d, d)$ |
| $w$ | Learned weight vector | $(d,)$ |
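
To make the shapes in the table concrete, here is a small shape check. The sizes `N = 5`, `d = 3` and the random data are hypothetical, chosen only so that each intermediate shape is visible.

```python
import numpy as np

# Hypothetical toy sizes, used only to illustrate the shapes in the table.
N, d = 5, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(N, d))      # feature matrix, shape (N, d)
y = rng.normal(size=N)           # targets, shape (N,)
lambda_param = 0.5               # regularization strength (float)

I = np.eye(d)                    # identity matrix, shape (d, d)
A = X.T @ X + lambda_param * I   # regularized matrix, shape (d, d)
b = X.T @ y                      # right-hand side, shape (d,)
w = np.linalg.inv(A) @ b         # weight vector, shape (d,)

print(A.shape, b.shape, w.shape)
```

Note that the identity matrix is sized by the number of features `d`, not by the number of samples `N`.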
---
### 📖 Background
In standard linear regression, the Normal Equation gives us:
$$w = (X^TX)^{-1}X^Ty$$
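**Ridge Regression** adds $\lambda I$ to $X^TX$ before inverting, which shrinks the weights and stabilizes the solution: for $\lambda > 0$ the matrix $X^TX + \lambda I$ is always invertible, even when $X^TX$ is not: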
$$w_{\text{ridge}} = (X^TX + \lambda I)^{-1}X^Ty$$
- When $\lambda = 0$, Ridge reduces to ordinary least squares.
- As $\lambda \to \infty$, all weights shrink toward zero.
- Ridge generally does not set any weight exactly to zero; it **shrinks** the weight vector as a whole, and individual weights need not shrink by the same factor.
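
The behavior described in these bullets can be checked numerically. The sketch below uses a hypothetical helper, `ridge_weights` (not part of the problem's required interface), and the small dataset from the example section; it compares $\lambda = 0$ against ordinary least squares and tracks the norm of the weight vector as $\lambda$ grows.

```python
import numpy as np

def ridge_weights(X, y, lambda_param):
    """Hypothetical helper: closed-form Ridge weights, without rounding."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    return np.linalg.inv(X.T @ X + lambda_param * np.eye(d)) @ X.T @ y

X = [[1, 1], [2, 3], [3, 5]]
y = [2, 5, 8]

# With lambda = 0 the Ridge formula should agree with ordinary least squares.
w_ols = np.linalg.lstsq(np.asarray(X, dtype=float),
                        np.asarray(y, dtype=float), rcond=None)[0]
w_zero = ridge_weights(X, y, 0.0)
print("lambda=0 matches OLS:", np.allclose(w_zero, w_ols))

# The norm of the weight vector decreases as lambda grows, but stays nonzero.
lambdas = (0.0, 0.1, 1.0, 10.0, 1000.0)
norms = [float(np.linalg.norm(ridge_weights(X, y, lam))) for lam in lambdas]
for lam, norm in zip(lambdas, norms):
    print(f"lambda={lam:>7}: ||w|| = {norm:.6f}")
```

The quantity that is guaranteed to decrease is the overall norm; a single weight can move by a different amount than the others, or even increase slightly for small $\lambda$.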
> Note: This implementation learns weights only.
> If you want an intercept/bias term, include a column of ones in `X`.
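
A minimal sketch of the column-of-ones approach, using hypothetical single-feature data generated as $y = 2x + 1$. One thing to be aware of: because $\lambda I$ covers every column of `X`, the formula as written also penalizes the intercept weight when $\lambda > 0$.

```python
import numpy as np

# Hypothetical data: y = 2x + 1, so the true intercept is 1 and the slope is 2.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Prepend a column of ones so the first weight acts as the intercept.
X_aug = np.column_stack([np.ones(X.shape[0]), X])   # shape (N, d + 1)
d = X_aug.shape[1]

def closed_form(lambda_param):
    """Hypothetical helper: Ridge weights for the augmented matrix."""
    A = X_aug.T @ X_aug + lambda_param * np.eye(d)
    return np.linalg.inv(A) @ X_aug.T @ y

w_plain = closed_form(0.0)   # no penalty: recovers intercept and slope
w_pen = closed_form(1.0)     # with penalty: the intercept is shrunk as well

print("lambda=0:", np.round(w_plain, 6))
print("lambda=1:", np.round(w_pen, 6))
```

On this data the unpenalized fit recovers intercept 1 and slope 2, while $\lambda = 1$ pulls the intercept down to 0.8 and leaves the slope at 2.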
---
### 📥 Input / 📤 Output
* **Input:** `X`, `y`, `lambda_param`
* **Output:** Tuple `(weights, y_pred)` — each element rounded to 6 decimals
---
### 🧩 Starter Code
```python
import numpy as np


def ridge_regression(X, y, lambda_param):
    """
    Fit Ridge Regression using the closed-form solution.

    Args:
        X: input features, shape (N, d)
        y: target values, shape (N,)
        lambda_param: regularization strength

    Returns:
        tuple: (weights, y_pred)
    """
    # TODO: Implement the Ridge closed-form solution
    pass
```
---
### 💡 Example
```python
X = [[1, 1], [2, 3], [3, 5]]
y = [2, 5, 8]
weights, y_pred = ridge_regression(X, y, 0.1)
print("Weights:", weights)
print("Predictions:", y_pred)
```
**Expected Output:**
```
Weights: [0.879927, 1.072411]
Predictions: [1.952337, 4.977085, 8.001833]
```
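
The expected output can be checked by hand, since the example has only two features. A worked version of the arithmetic, with $\lambda = 0.1$:

$$X^TX = \begin{bmatrix} 14 & 22 \\ 22 & 35 \end{bmatrix}, \qquad X^Ty = \begin{bmatrix} 36 \\ 57 \end{bmatrix}$$

$$X^TX + 0.1\,I = \begin{bmatrix} 14.1 & 22 \\ 22 & 35.1 \end{bmatrix}, \qquad \det = 14.1 \cdot 35.1 - 22^2 = 494.91 - 484 = 10.91$$

$$w = \frac{1}{10.91} \begin{bmatrix} 35.1 & -22 \\ -22 & 14.1 \end{bmatrix} \begin{bmatrix} 36 \\ 57 \end{bmatrix} = \frac{1}{10.91} \begin{bmatrix} 9.6 \\ 11.7 \end{bmatrix} \approx \begin{bmatrix} 0.879927 \\ 1.072411 \end{bmatrix}$$

$$\hat{y}_1 = \frac{9.6 + 11.7}{10.91} = \frac{21.3}{10.91} \approx 1.952337$$

Adding the already-rounded weights would give $0.879927 + 1.072411 = 1.952338$ instead, so the predictions shown are consistent with computing $\hat{y}$ from the unrounded weights and rounding afterwards.
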
---
### 🧭 Hint
Use `np.eye(d)` to create the identity matrix and `np.linalg.inv()` for matrix inversion.
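
One possible way to put the hint together is sketched below. It is not presented as the reference solution: returning plain Python lists is an assumption based on how the expected output is printed, and converting the inputs with `np.asarray` is a choice made here so that list inputs work.

```python
import numpy as np

def ridge_regression(X, y, lambda_param):
    """Sketch of the Ridge closed-form solution, following the hint."""
    X = np.asarray(X, dtype=float)   # accept lists as well as arrays
    y = np.asarray(y, dtype=float)
    d = X.shape[1]                   # number of features

    # Regularized normal equation: w = (X^T X + lambda * I)^{-1} X^T y
    A = X.T @ X + lambda_param * np.eye(d)
    w = np.linalg.inv(A) @ X.T @ y

    # Predict with the unrounded weights, then round both results at the end.
    y_pred = X @ w
    # Assumption: lists are returned so printing matches the example's format.
    return np.round(w, 6).tolist(), np.round(y_pred, 6).tolist()

weights, y_pred = ridge_regression([[1, 1], [2, 3], [3, 5]], [2, 5, 8], 0.1)
print("Weights:", weights)
print("Predictions:", y_pred)
```

In practice, `np.linalg.solve(A, X.T @ y)` is a common alternative to forming the inverse explicitly and is usually the numerically safer choice; the hint's `np.linalg.inv()` is sufficient for a small problem like this one.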
---