Percentiles and Quantiles Practice Problem
This data science coding problem gives you practice with percentiles and quantiles, part of the Mathematical & Statistical Operations topic. Read the problem statement, then write your own solution to strengthen your implementation skills.
- Problem ID: 56
- Problem key: 56-percentiles-and-quantiles
- URL: https://datacrack.app/solve/56-percentiles-and-quantiles
- Difficulty: medium
- Topic: Mathematical & Statistical Operations
- Module: NumPy Foundations
Problem Statement
# 🧩 Percentiles and Quantiles
---
### 🎯 Goal
Percentiles and quantiles are essential tools in **Exploratory Data Analysis (EDA)** and outlier detection. The 25th, 50th, and 75th percentiles, together with the minimum and maximum, form the **five-number summary**, which a **box plot** visualizes. The **IQR (Interquartile Range)** is used to identify outliers via the 1.5×IQR rule.
---
### 🔍 Key Concepts
| Metric | Formula | Meaning |
|:-------|:--------|:--------|
| P25 (Q1) | `np.percentile(arr, 25)` | About 25% of values fall at or below this |
| P50 (median) | `np.percentile(arr, 50)` | Middle value; about half the data lies on each side |
| P75 (Q3) | `np.percentile(arr, 75)` | About 75% of values fall at or below this |
| IQR | Q3 − Q1 | Spread of the middle 50% |
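
The calls in the table can also be combined into one vectorized call, and `np.quantile` expresses the same cut points on a 0 to 1 scale instead of 0 to 100. The following is a minimal sketch; the sample array `arr` is made up for illustration and is not part of the problem's test data.

```python
import numpy as np

# Illustrative sample data (an assumption, not taken from the problem's tests).
arr = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0])

# A single call returns Q1, the median, and Q3 as an array.
q1, median, q3 = np.percentile(arr, [25, 50, 75])

# np.quantile takes fractions in [0, 1] rather than percentages in [0, 100].
same = np.quantile(arr, [0.25, 0.5, 0.75])

# The IQR is the width of the middle 50% of the data.
iqr = q3 - q1
print(q1, median, q3, iqr)
```

Both functions describe the same cut points, so the choice between them is mostly a matter of which scale reads more naturally in context.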
**Outlier detection using IQR:**
- Lower fence: Q1 − 1.5 × IQR
- Upper fence: Q3 + 1.5 × IQR
- Points outside these fences are flagged as outliers
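
The fence rule above can be sketched as follows. The helper name `iqr_outliers`, its parameter `k`, and the sample data are hypothetical and chosen only for illustration; they are not part of the required solution.

```python
import numpy as np

def iqr_outliers(data, k=1.5):
    """Hypothetical helper: return (lower_fence, upper_fence, outliers) for the k x IQR rule."""
    arr = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    lower = q1 - k * iqr   # lower fence
    upper = q3 + k * iqr   # upper fence
    # Flag points strictly outside the fences.
    mask = (arr < lower) | (arr > upper)
    return lower, upper, arr[mask]

# Illustrative data: nine ordinary values and one extreme value.
lower, upper, outliers = iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(lower, upper, outliers)
```

In this sample only the value 100 lies outside the fences. Because the quartiles barely move when one extreme value is added, the fences stay close to where they would be without it.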
---
### 💻 Task
Implement `compute_percentiles(data)` using `np.percentile()`.
---
### 📥 Input
- `data`: list of numbers
### 📤 Output
- dict with keys `"p25"`, `"p50"`, `"p75"`, `"iqr"`
---
### 🧩 Starter Code
```python
import numpy as np

def compute_percentiles(data):
    """
    Compute the 25th, 50th, 75th percentiles and IQR.

    Args:
        data (list): Input numbers

    Returns:
        dict: {"p25", "p50", "p75", "iqr"}
    """
    arr = np.array(data, dtype=float)
    # 🧠 TODO
    pass
```
---
### 💡 Example
```python
compute_percentiles([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Expected: {"p25": 3.25, "p50": 5.5, "p75": 7.75, "iqr": 4.5}
```
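
The non-integer values in the expected output come from interpolation: by default `np.percentile` interpolates linearly between the two closest sorted values. The sketch below reproduces that arithmetic by hand for the example input. The helper `percentile_linear` is hypothetical and exists only to show the calculation; it is not a replacement for `np.percentile`.

```python
import math
import numpy as np

def percentile_linear(sorted_vals, p):
    """Hypothetical helper: percentile p (0-100) by linear interpolation on sorted data."""
    n = len(sorted_vals)
    pos = (n - 1) * p / 100.0      # fractional index into the sorted data
    lo = math.floor(pos)           # index of the value just below the position
    hi = min(lo + 1, n - 1)        # index of the value just above (clamped at the end)
    frac = pos - lo                # how far the position sits between lo and hi
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])

vals = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
for p in (25, 50, 75):
    # For p = 25: pos = 9 * 0.25 = 2.25, so the result is 3 + 0.25 * (4 - 3) = 3.25.
    print(p, percentile_linear(vals, p), np.percentile(vals, p))
```

For this input the manual calculation and `np.percentile` agree at 3.25, 5.5, and 7.75.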
---
### 🔑 Key Concepts
- `np.percentile(arr, 50)` = median
- `np.percentile(arr, [25, 50, 75])` computes all three at once
- IQR = P75 - P25, a measure of spread that is robust to outliers