Variance and Standard Deviation Practice Problem

This data science coding problem gives you practice computing variance and standard deviation, part of the Mathematical & Statistical Operations topic. Read the problem statement, then write your solution.

Problem ID: 57
Problem key: 57-variance-and-standard-deviation
URL: https://datacrack.app/solve/57-variance-and-standard-deviation
Difficulty: easy
Topic: Mathematical & Statistical Operations
Module: NumPy Foundations

Problem Statement

# 🧩 Variance and Standard Deviation

---


### 🎯 Goal

Variance and standard deviation measure how spread out the values in an array are.  
In this NumPy problem, we compute them directly from the data and connect the formulas to `np.var()` and `np.std()`.

---

### 🔍 Formulas

For an array with `n` values:

$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$

$$
\text{variance} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

$$
\text{standard deviation} = \sqrt{\text{variance}}
$$

| Symbol | Name | Meaning |
|:-------|:-----|:--------|
| $\bar{x}$ | Mean of the array | Average value in this data |
| $\text{variance}$ | Variance of the array | Average squared distance from the mean |
| $\text{standard deviation}$ | Standard deviation of the array | Typical distance from the mean in the original units |
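
To connect the formulas to code, here is a minimal sketch that evaluates each one step by step with basic array operations. The data values and variable names (`x`, `n`, `mean`, `variance`, `std`) are illustrative assumptions, not part of the problem.

```python
import numpy as np

# Illustrative data (an assumption for this sketch, not from the problem).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = x.size

mean = x.sum() / n                      # x-bar = (1/n) * sum of x_i
variance = ((x - mean) ** 2).sum() / n  # average squared distance from the mean
std = np.sqrt(variance)                 # square root brings us back to the original units

print(mean, variance, std)
```

For this data the mean is 3 and the squared deviations are 4, 1, 0, 1, 4, so the variance is 10 / 5 = 2 and the standard deviation is the square root of 2. The same values should come back from `np.mean(x)`, `np.var(x)`, and `np.std(x)`.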

---

### 💻 Task
Implement `compute_spread(data)` using NumPy.

---

### 📥 Input
- `data`: list of numbers

### 📤 Output
- dict with keys `"mean"`, `"variance"`, `"std"`

---

### 🧩 Starter Code
```python
import numpy as np

def compute_spread(data):
    """
    Compute mean, variance, and standard deviation of a dataset.

    Args:
        data (list): List of numbers

    Returns:
        dict: {"mean": float, "variance": float, "std": float}
    """
    arr = np.array(data, dtype=float)
    # 🧠 TODO: np.mean(arr), np.var(arr), np.std(arr)
    pass
```

---

### 💡 Example
```python
compute_spread([2, 4, 4, 4, 5, 5, 7, 9])
# Expected: {"mean": 5.0, "variance": 4.0, "std": 2.0}
```
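
The expected values can be checked by hand with the formulas from this problem. The mean comes first:

$$
\bar{x} = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5
$$

The squared distances from the mean are 9, 1, 1, 1, 0, 0, 4, and 16, so:

$$
\text{variance} = \frac{9 + 1 + 1 + 1 + 0 + 0 + 4 + 16}{8} = \frac{32}{8} = 4
$$

$$
\text{standard deviation} = \sqrt{4} = 2
$$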

---

### 🔑 Key Concepts
- `np.var(arr)` computes the **population variance** by default (divides by `n`, which corresponds to `ddof=0`)
- `np.var(arr, ddof=1)` computes the **sample variance** (divides by `n - 1`), which is used when estimating the variance of a larger population from a sample
- For this problem, use the population variance (the default, `ddof=0`)
- `np.std(arr)` is equivalent to `np.sqrt(np.var(arr))`
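
The effect of `ddof` can be seen on the example data from this problem. The snippet below is a small sketch; the variable names `pop_var` and `sample_var` are chosen for illustration only.

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

pop_var = np.var(arr)             # ddof=0 (default): sum of squared deviations / n
sample_var = np.var(arr, ddof=1)  # ddof=1: sum of squared deviations / (n - 1)

print(pop_var)     # 32 / 8 = 4.0 for this data
print(sample_var)  # 32 / 7, roughly 4.571

# The standard deviation is the square root of the variance with the same ddof.
print(np.isclose(np.std(arr), np.sqrt(pop_var)))
print(np.isclose(np.std(arr, ddof=1), np.sqrt(sample_var)))
```

The sample variance is always the larger of the two for the same data, because the same sum of squared deviations is divided by a smaller number.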
