Robust Scaling Practice Problem
This data science coding problem helps you practice Feature Scaling & Transformation, robust scaling, and implementation skills. Read the problem statement, write your solution, and strengthen your understanding of Feature Scaling & Transformation.
- Problem ID: 36
- Problem key: 36-robust-scaling
- URL: https://datacrack.app/solve/36-robust-scaling
- Difficulty: medium
- Topic: Feature Scaling & Transformation
- Module: Data Cleaning
Problem Statement
# Robust Scaling
### 🎯 Goal
Scale numeric columns using median and interquartile range (IQR), making the transformation robust to outliers.
### 💻 Task
Implement `robust_scale(data)` that:
1. Converts the input dictionary to a DataFrame
2. Applies the Robust Scaling formula to each numeric column: `(x - median) / IQR`
3. Computes IQR as Q3 (75th percentile) minus Q1 (25th percentile)
4. Returns the scaled DataFrame rounded to 2 decimal places
---
### 📥 Input
- `data`: A dictionary where keys are column names and values are lists of numeric data
### 📤 Output
- A pandas DataFrame with all numeric columns scaled using median/IQR, rounded to 2 decimals
---
### 🧩 Starter Code
```python
import pandas as pd
import numpy as np
def robust_scale(data):
"""
Scale numeric columns using median and IQR (robust to outliers).
Args:
data (dict): Input data as dictionary (from JSON)
Returns:
pd.DataFrame: DataFrame with robust-scaled values rounded to 2 decimals
"""
# TODO: Convert the input dictionary to a DataFrame
# TODO: Compute Q1 (25th percentile), Q3 (75th percentile), and median
# TODO: Calculate IQR = Q3 - Q1
# TODO: Apply (x - median) / IQR to each column
# TODO: Round and return
pass
```
---
### 💡 Examples
**Example 1:** Symmetric data
```python
data = {"A": [1.0, 2.0, 3.0, 4.0, 5.0]}
robust_scale(data)
```
```
A
0 -1.0
1 -0.5
2 0.0
3 0.5
4 1.0
```
**Example 2:** Scaled symmetric data
```python
data = {"X": [10.0, 20.0, 30.0, 40.0, 50.0]}
robust_scale(data)
```
```
X
0 -1.0
1 -0.5
2 0.0
3 0.5
4 1.0
```
**Example 3:** Data with an outlier
```python
data = {"A": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 100.0]}
robust_scale(data)
```
```
A
0 -0.88
1 -0.62
2 -0.38
3 -0.12
4 0.12
5 0.38
6 0.62
7 11.38
```Starter Code
import pandas as pd
import numpy as np
def robust_scale(data):
"""
Scale numeric columns using median and IQR (robust to outliers).
Args:
data (dict): Input data as dictionary (from JSON)
Returns:
pd.DataFrame: DataFrame with robust-scaled values rounded to 2 decimals
"""
# TODO: Convert the input dictionary to a DataFrame
# TODO: Compute Q1 (25th percentile), Q3 (75th percentile), and median
# TODO: Calculate IQR = Q3 - Q1
# TODO: Apply (x - median) / IQR to each column
# TODO: Round and return
pass