Bias-Variance Tradeoff Practice Problem

This data science coding problem covers model generalization: you diagnose high bias, high variance, or a good fit from training and validation errors, and practice implementing that diagnosis in code. Read the problem statement, then write your solution.

Problem ID: 157
Problem key: 157-bias-variance-tradeoff
URL: https://datacrack.app/solve/157-bias-variance-tradeoff
Difficulty: medium
Topic: Model Generalization
Module: Introduction to Machine Learning

Problem Statement

# 🧩 Bias-Variance Tradeoff

---

### 🎯 Goal

Classify model behavior as **high bias**, **high variance**, or **good fit** using training and validation errors.

---

### 📖 Introduction

The bias-variance tradeoff explains two common ways a model can generalize poorly.

| Pattern | Training Error | Validation Error | Meaning |
|:--------|:---------------|:-----------------|:--------|
| **High bias** | high | high | model is too simple and underfits |
| **High variance** | low | much higher | model is too sensitive and overfits |
| **Good fit** | low | low | model generalizes well |

In this problem, lower error is better.

---

### 💻 Task

Implement `analyze_bias_variance`.

For each model:

1. Compute:

   $$
   \text{error gap} = \text{validation error} - \text{train error}
   $$

2. Use these rules:
   - `high_bias`: train error and validation error are both above `high_error_threshold`
   - `high_variance`: train error is not above `high_error_threshold`, but the error gap is above `gap_threshold`
   - `good_fit`: anything else

3. Round the numeric values in the output (`train_error`, `validation_error`, `error_gap`) to 6 decimal places.
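
The rules above can be sketched for a single model as follows. This is a minimal illustration, not the reference solution: `diagnose_one` is a hypothetical helper name, the strict `>` comparisons assume that "above" excludes the threshold itself, and the comparisons are made on unrounded values, which the problem statement does not specify.

```python
def diagnose_one(train_error, validation_error,
                 high_error_threshold=0.3, gap_threshold=0.1):
    """Hypothetical helper: label one model using the three rules."""
    error_gap = validation_error - train_error

    # Assumption: "high" means strictly above the threshold.
    train_is_high = train_error > high_error_threshold
    validation_is_high = validation_error > high_error_threshold

    if train_is_high and validation_is_high:
        return "high_bias"
    if not train_is_high and error_gap > gap_threshold:
        return "high_variance"
    return "good_fit"
```

Checking `high_bias` first means a model with high errors on both sets is never labeled `high_variance`, even when its gap is also large.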

---

### 📥 Input / 📤 Output

**Input**
- `train_errors` (`list[float]`): training error for each model
- `validation_errors` (`list[float]`): validation error for each model
- `high_error_threshold` (`float`, default `0.3`): an error above this value is considered high
- `gap_threshold` (`float`, default `0.1`): an error gap (validation error minus train error) above this value is considered large

**Output**
- `list[dict]`: one diagnostic dictionary per model

Each dictionary should contain:
- `model_index`
- `train_error`
- `validation_error`
- `error_gap`
- `diagnosis`
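
One way to assemble a single output dictionary is sketched below. `build_record` is a hypothetical helper, and the key order simply follows the list above; the problem statement does not say that order matters.

```python
def build_record(model_index, train_error, validation_error, diagnosis):
    """Hypothetical helper: assemble one diagnostic dictionary."""
    return {
        "model_index": model_index,
        "train_error": round(train_error, 6),
        "validation_error": round(validation_error, 6),
        # The gap is computed from the raw errors, then rounded.
        "error_gap": round(validation_error - train_error, 6),
        "diagnosis": diagnosis,
    }


record = build_record(1, 0.18, 0.20, "good_fit")
```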

---



### 🧩 Starter Code

```python
def analyze_bias_variance(train_errors, validation_errors, high_error_threshold=0.3, gap_threshold=0.1):
    """
    Diagnose model behavior from train and validation errors.
    """
    # TODO 1: Loop over train and validation errors
    # TODO 2: Compute validation-train error gap
    # TODO 3: Apply bias-variance diagnosis rules
    # TODO 4: Return one diagnostic dictionary per model
    pass
```

---

### 💡 Example

```python
analyze_bias_variance(
    train_errors=[0.42, 0.18, 0.05],
    validation_errors=[0.44, 0.20, 0.34]
)
```

**Expected Output**

```python
[
    {"model_index": 0, "train_error": 0.42, "validation_error": 0.44, "error_gap": 0.02, "diagnosis": "high_bias"},
    {"model_index": 1, "train_error": 0.18, "validation_error": 0.2, "error_gap": 0.02, "diagnosis": "good_fit"},
    {"model_index": 2, "train_error": 0.05, "validation_error": 0.34, "error_gap": 0.29, "diagnosis": "high_variance"}
]
```
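
The `error_gap` values in the expected output depend on the rounding step. Subtracting two floats can leave a tiny representation error in the last digits, so the raw gap may not compare equal to the value shown. A short check for model 2, using only built-in `round`:

```python
raw_gap = 0.34 - 0.05            # model 2: validation error minus train error
rounded_gap = round(raw_gap, 6)  # rounding removes floating-point noise

print(repr(raw_gap))             # may show extra trailing digits
print(rounded_gap)               # 0.29
```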

---

### 🧭 Hint

High bias is about errors being too high overall. High variance is about the validation error being much worse than training error.
