Polynomial Regression Practice Problem
This data science coding problem helps you practice linear regression, polynomial feature construction, and implementation skills. Read the problem statement, write your solution, and strengthen your understanding of linear regression.
- Problem ID: 7
- Problem key: 7-polynomial-regression
- URL: https://datacrack.app/solve/7-polynomial-regression
- Difficulty: medium
- Topic: Linear Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Polynomial Regression – Linear in Weights
---
### 🎯 Goal
Understand how we can fit **nonlinear data** using **linear regression**,
by expressing the model as a polynomial that is **linear in weights**.
---
### 📖 Introduction
In linear regression, we assume a straight-line relationship between $x$ and $y$:
$$
\hat{y} = w_0 + w_1 x
$$
But many real-world datasets are not linear.
For example, a dataset might follow a curved trend, something like:
$$
y \approx w_0 + w_1 x + w_2 x^2
$$
This looks nonlinear in $x$, but notice that it's **still linear in the parameters** $w_0, w_1, w_2$.
That's why it can still be solved using the **Normal Equation**, just like simple linear regression.
The key idea:
> Polynomial regression is nonlinear in the data, but linear in the weights.
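Because the model is linear in the weights, standard least-squares machinery applies unchanged once the powers of $x$ are stacked as feature columns. A minimal sketch of this idea (using `np.linalg.lstsq` here as a stand-in for the Normal Equation you will implement below):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 1.0 * x + 0.5 * x**2  # curved in x, but linear in w0, w1, w2

# Stack the features [1, x, x^2]: the fit is now a plain linear problem in w
X = np.column_stack([np.ones_like(x), x, x**2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
# w recovers approximately [2, 1, 0.5]
```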
---
### 📘 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
|:-------:|:--------|:-------------|
| **$x$** | Input values | $(n,)$ |
| **$y$** | Target values | $(n,)$ |
| **$p$** | Polynomial degree | integer |
| **$w_i$** | Weight for the $i$-th power of $x$ | scalar |
| **$\hat{y}$** | Predicted values | $(n,)$ |
---
### 📥 Input / 📤 Output
- `x`: NumPy array of shape $(n,)$ – input values
- `y`: NumPy array of shape $(n,)$ – true targets
- `degree`: integer, degree of the polynomial
**Returns:**
- `weights`: NumPy array of shape $(degree + 1,)$
- `y_pred`: NumPy array of shape $(n,)$
---
### 💻 Task
You are given data points $(x, y)$ that follow a curved pattern.
You need to **fit a polynomial regression model** of degree `p` that minimizes the mean squared error.
Steps:
1. Construct the **design matrix** $X$ such that:
$$
X_{ij} = x_i^j, \quad j = 0, 1, \dots, p
$$
(the first column corresponds to $x^0 = 1$, which acts as the bias term).
2. Compute the weights using the **Normal Equation**:
$$
w = (X^T X)^{-1} X^T y
$$
3. Compute the predictions:
$$
\hat{y} = Xw
$$
4. Return both `weights` and `y_pred`.
This approach allows you to fit **curved data** while still using **linear algebra** methods.
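If you want to check your work, the three steps above can be sketched as follows. Note this sketch uses `np.linalg.solve` on the normal equations rather than forming the explicit inverse $(X^T X)^{-1}$, which is numerically more stable but mathematically equivalent:

```python
import numpy as np

def polynomial_regression_sketch(x, y, degree):
    # Step 1: design matrix with columns x^0, x^1, ..., x^degree
    X = np.vander(x, N=degree + 1, increasing=True)
    # Step 2: Normal Equation, solved as a linear system instead of an inverse
    w = np.linalg.solve(X.T @ X, X.T @ y)
    # Step 3: predictions
    y_pred = X @ w
    return w, y_pred

w, y_pred = polynomial_regression_sketch(
    np.array([1.0, 2.0, 3.0]), np.array([2.0, 3.0, 5.0]), degree=2
)
# w ≈ [2, -0.5, 0.5], y_pred ≈ [2, 3, 5]
```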
---
### 🧩 Starter Code
```python
import numpy as np
def polynomial_regression(x, y, degree):
    """
    Fit a polynomial regression model using the Normal Equation.

    Args:
        x (np.ndarray): Input values, shape (n,)
        y (np.ndarray): Target values, shape (n,)
        degree (int): Degree of the polynomial

    Returns:
        tuple: (weights, predictions)
    """
    # TODO 1: Build the matrix X where X[i, j] = x[i] ** j
    # TODO 2: Compute w = (X.T @ X)^(-1) @ X.T @ y
    # TODO 3: Compute y_pred = X @ w
    pass
```
---
### 💡 Example
```python
x = np.array([1, 2, 3])
y = np.array([2, 3, 5])
w, y_pred = polynomial_regression(x, y, degree=2)
print("Weights:", np.round(w, 3))
print("Predictions:", np.round(y_pred, 3))
```
**Expected Output:**
```
Weights: [ 2. -0.5 0.5]
Predictions: [2. 3. 5.]
```
---
### 🧠 Hint
To build the matrix $X$, you can use:
`np.vander(x, N=degree+1, increasing=True)`
This automatically generates all powers of $x$ from $x^0$ up to $x^p$.
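For example, with the inputs from the worked example above, `np.vander` produces exactly the design matrix described in Step 1:

```python
import numpy as np

x = np.array([1, 2, 3])
X = np.vander(x, N=3, increasing=True)  # columns: x^0, x^1, x^2
print(X)
# [[1 1 1]
#  [1 2 4]
#  [1 3 9]]
```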