Polynomial Regression Practice Problem
This data science coding problem helps you practice linear regression, polynomial feature construction, and implementation skills. Read the problem statement, write your solution, and strengthen your understanding of linear regression.
- Problem ID: 7
- Problem key: 7-polynomial-regression
- URL: https://datacrack.app/solve/7-polynomial-regression
- Difficulty: medium
- Topic: Linear Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Polynomial Regression – Linear in Weights
---
### 🎯 Goal
Understand how we can fit **nonlinear data** using **linear regression**,
by expressing the model as a polynomial that is **linear in weights**.
---
### 📖 Introduction
In linear regression, we assume a straight-line relationship between $x$ and $y$:
$$
\hat{y} = w_0 + w_1 x
$$
But many real-world datasets are not linear.
For example, a dataset might follow a curved trend, something like:
$$
y \approx w_0 + w_1 x + w_2 x^2
$$
This looks nonlinear in $x$, but notice that it's **still linear in the parameters** $w_0, w_1, w_2$.
That's why it can still be solved using the **Normal Equation**, just like simple linear regression.
The key idea:
> Polynomial regression is nonlinear in the data, but linear in the weights.
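Because the model is linear in the weights, standard least-squares machinery applies unchanged once the powers of $x$ are stacked as feature columns. A minimal sketch of this idea (using `np.linalg.lstsq` here as a stand-in for the Normal Equation you will implement below):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 1.0 * x + 0.5 * x**2  # curved in x, but linear in w0, w1, w2

# Stack the features [1, x, x^2]: the fit is now a plain linear problem in w
X = np.column_stack([np.ones_like(x), x, x**2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
# w recovers approximately [2, 1, 0.5]
```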
---
### 📘 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
|:-------:|:--------|:-------------|
| **$x$** | Input values | $(n,)$ |
| **$y$** | Target values | $(n,)$ |
| **$p$** | Polynomial degree | integer |
| **$w_i$** | Weight for the $i$-th power of $x$ | scalar |
| **$\hat{y}$** | Predicted values | $(n,)$ |
---
### 📥 Input / 📤 Output
- `x`: NumPy array of shape $(n,)$ – input values
- `y`: NumPy array of shape $(n,)$ – true targets
- `degree`: integer, degree of the polynomial
**Returns:**
- `weights`: NumPy array of shape $(degree + 1,)$
- `y_pred`: NumPy array of shape $(n,)$
---
### 💻 Task
You are given data points $(x, y)$ that follow a curved pattern.
You need to **fit a polynomial regression model** of degree `p` that minimizes the mean squared error.
Steps:
1. Construct the **design matrix** $X$ such that:
$$
X_{ij} = x_i^j, \quad j = 0, 1, \dots, p
$$
(the first column corresponds to $x^0 = 1$, which acts as the bias term).
2. Compute the weights using the **Normal Equation**:
$$
w = (X^T X)^{-1} X^T y
$$
3. Compute the predictions:
$$
\hat{y} = Xw
$$
4. Return both `weights` and `y_pred`.
This approach allows you to fit **curved data** while still using **linear algebra** methods.
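If you want to check your work, the three steps above can be sketched as follows. Note this sketch uses `np.linalg.solve` on the normal equations rather than forming the explicit inverse $(X^T X)^{-1}$, which is numerically more stable but mathematically equivalent:

```python
import numpy as np

def polynomial_regression_sketch(x, y, degree):
    # Step 1: design matrix with columns x^0, x^1, ..., x^degree
    X = np.vander(x, N=degree + 1, increasing=True)
    # Step 2: Normal Equation, solved as a linear system instead of an inverse
    w = np.linalg.solve(X.T @ X, X.T @ y)
    # Step 3: predictions
    y_pred = X @ w
    return w, y_pred

w, y_pred = polynomial_regression_sketch(
    np.array([1.0, 2.0, 3.0]), np.array([2.0, 3.0, 5.0]), degree=2
)
# w ≈ [2, -0.5, 0.5], y_pred ≈ [2, 3, 5]
```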
---
### 🧩 Starter Code
```python
import numpy as np
def polynomial_regression(x, y, degree):
    """
    Fit a polynomial regression model using the Normal Equation.

    Args:
        x (np.ndarray): Input values, shape (n,)
        y (np.ndarray): Target values, shape (n,)
        degree (int): Degree of the polynomial

    Returns:
        tuple: (weights, predictions)
    """
    # TODO 1: Build the matrix X where X[i, j] = x[i] ** j
    # TODO 2: Compute w = (X.T @ X)^(-1) @ X.T @ y
    # TODO 3: Compute y_pred = X @ w
    pass
```
---
### 💡 Example
```python
x = np.array([1, 2, 3])
y = np.array([2, 3, 5])
w, y_pred = polynomial_regression(x, y, degree=2)
print("Weights:", np.round(w, 3))
print("Predictions:", np.round(y_pred, 3))
```
**Expected Output:**
```
Weights: [ 2. -0.5 0.5]
Predictions: [2. 3. 5.]
```
---
### 🧠 Hint
To build the matrix $X$, you can use:
`np.vander(x, N=degree+1, increasing=True)`
This automatically generates all powers of $x$ from $x^0$ up to $x^p$.
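For example, with the inputs from the worked example above, `np.vander` produces exactly the design matrix described in Step 1:

```python
import numpy as np

x = np.array([1, 2, 3])
X = np.vander(x, N=3, increasing=True)  # columns: x^0, x^1, x^2
print(X)
# [[1 1 1]
#  [1 2 4]
#  [1 3 9]]
```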