Z-Score Standardization Practice Problem

This data science coding problem helps you practice z-score standardization, a core mathematical and statistical operation, along with your implementation skills. Read the problem statement, then write your solution.

Problem ID: 37
Problem key: 37-z-score-standardization
URL: https://datacrack.app/solve/37-z-score-standardization
Difficulty: easy
Topic: Mathematical & Statistical Operations
Module: NumPy Foundations

Problem Statement

# 🧩 Z-Score Standardization

---

### 🎯 Goal
Z-score standardization, also called **standardization** or **standard scaling**, transforms data so it has **mean = 0** and **standard deviation = 1**.

It is a common preprocessing step for models that are sensitive to feature scale, such as regularized or gradient-descent-trained linear and logistic regression, SVMs, PCA, and many neural networks.

$$
z = \frac{x - \bar{x}}{s}
$$

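where $\bar{x}$ is the mean of the array and $s$ is its standard deviation. The expected output in this problem uses the population standard deviation (divide by $n$), which is NumPy's default (`np.std` with `ddof=0`).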
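
As a small worked example of the formula, consider the array $[1, 2, 3, 4, 5]$. The values are made up for illustration, and $s$ is taken to be the population standard deviation (dividing by $n = 5$):

$$
\bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = 3, \qquad
s = \sqrt{\frac{(-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2}{5}} = \sqrt{2} \approx 1.414
$$

$$
z_{x=5} = \frac{5 - 3}{\sqrt{2}} \approx 1.414, \qquad
z_{x=3} = \frac{3 - 3}{\sqrt{2}} = 0
$$

So the value 5 sits about 1.41 standard deviations above the mean, and the value 3 sits exactly at the mean.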

---

### 🔍 What Z-Score Means

| Z-score | Interpretation |
|:--------|:--------------|
| 0 | exactly at the mean |
| +1 | one standard deviation above mean |
| -2 | two standard deviations below mean |
| beyond ±3 | possible outlier (common rule of thumb) |

After standardization, all features share the same scale, so no feature dominates just because its units are larger.
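
To make the interpretation table concrete, here is a minimal sketch that flags values more than three standard deviations from the mean. The `readings` array and the variable names are hypothetical, and the threshold of 3 is only the rule of thumb from the table, not a strict definition of an outlier.

```python
import numpy as np

# Hypothetical readings: ten typical values and one extreme value.
readings = np.array([10.0] * 10 + [100.0])

# Z-scores using the population standard deviation (NumPy's default, ddof=0).
z = (readings - readings.mean()) / readings.std()

# Rule of thumb from the table: |z| > 3 marks a possible outlier.
is_outlier = np.abs(z) > 3

print(readings[is_outlier])
```

Only the `100.0` reading is flagged here. Note that with the population standard deviation, the largest possible |z| among `n` values is `sqrt(n - 1)`, so an array of 10 or fewer values can never produce a z-score beyond ±3.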

---

### 💻 Task
Implement `z_score_standardize(data)` using vectorized NumPy.

---

### 📥 Input
- `data`: list of numbers

### 📤 Output
- List of z-scores (floats), same length as input

---

### 🧩 Starter Code
```python
import numpy as np

def z_score_standardize(data):
    """
    Standardize data to zero mean and unit variance (z-scores).

    Args:
        data (list): Input numbers

    Returns:
        list[float]: Z-scores
    """
    arr = np.array(data, dtype=float)
    # 🧠 TODO
    pass
```

---

### 💡 Example
```python
z_score_standardize([2, 4, 4, 4, 5, 5, 7, 9])
# Expected: [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```
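
If you want to check your own attempt, the following is one possible solution sketch, not an official reference answer. It assumes the population standard deviation (`ddof=0`), which is what the expected output above implies: the example data has mean 5 and standard deviation 2.

```python
import numpy as np

def z_score_standardize(data):
    """One possible solution sketch for the task."""
    arr = np.array(data, dtype=float)
    # Subtract the mean and divide by the population std (ddof=0).
    # Both operations are vectorized, so no Python loop is needed.
    z = (arr - arr.mean()) / arr.std()
    # The task asks for a plain list of floats, not a NumPy array.
    return z.tolist()

result = z_score_standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(result)
```

This sketch does not guard against a zero standard deviation; an input whose values are all equal would produce `nan` entries.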

---

### 🔑 Key Concepts

- Result has mean ≈ `0.0` and std ≈ `1.0` (verify with `np.mean(result)` and `np.std(result)`).

- For 2D data, standardization is often done column-wise, meaning each feature/column gets its own mean and standard deviation:

```python
(X - X.mean(axis=0)) / X.std(axis=0)
```

- This is the same idea used by `sklearn.preprocessing.StandardScaler`.

- If `std == 0`, all values are the same and the division becomes `0 / 0`, so NumPy returns `nan` (with a runtime warning) unless you handle that case explicitly.
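
The column-wise formula and the zero-standard-deviation case can be combined in a short sketch. The helper name `standardize_columns` and the sample matrix are hypothetical, and replacing a zero standard deviation with 1 (so constant columns become all zeros) is only one possible convention, not the required behavior for this problem.

```python
import numpy as np

def standardize_columns(X):
    """Hypothetical helper: column-wise z-scores with a zero-std guard."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)  # one mean per column
    std = X.std(axis=0)    # one population std (ddof=0) per column
    # Assumption: treat a zero std as 1 so a constant column maps to zeros
    # instead of nan. Other conventions (raising an error, keeping nan) exist.
    safe_std = np.where(std == 0, 1.0, std)
    return (X - mean) / safe_std

# Columns on different scales, plus one constant column.
X = np.array([[1.0, 10.0, 7.0],
              [2.0, 20.0, 7.0],
              [3.0, 30.0, 7.0]])
Z = standardize_columns(X)
print(Z)
```

The first two columns end up with identical z-scores even though their original scales differ by a factor of ten, and the constant third column becomes all zeros rather than `nan`.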
