Log Transformation Practice Problem
This data science coding problem gives you practice implementing log transformation, part of the Feature Scaling & Transformation topic. Read the problem statement, then write your solution.
- Problem ID: 34
- Problem key: 34-log-transformation
- URL: https://datacrack.app/solve/34-log-transformation
- Difficulty: medium
- Topic: Feature Scaling & Transformation
- Module: Data Cleaning
Problem Statement
# Log Transformation
### 🎯 Goal
Apply logarithmic transformations to reduce skewness in numeric data.
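To see what "reduce skewness" means in practice, the following minimal sketch compares the sample skewness of a series before and after `np.log1p`. The series `values` is a made-up, exponentially growing sample chosen for illustration; it is not part of the problem's test data.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed sample: each value doubles the previous one.
values = pd.Series([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 512.0])

skew_before = values.skew()            # sample skewness of the raw values
skew_after = np.log1p(values).skew()   # sample skewness after log1p

print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

The raw values have a long right tail, while the transformed values are close to evenly spaced, so the skewness should drop toward zero.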
### 💻 Task
Implement `log_transform(data, method='log1p')` that:
1. Converts the input dictionary to a DataFrame
2. Applies the specified log method to all numeric columns:
   - `'log1p'` → `np.log1p(x)`: natural log of (1 + x), safe for zeros
   - `'log10'` → `np.log10(x)`: base-10 logarithm
   - `'log2'` → `np.log2(x)`: base-2 logarithm
3. Returns the transformed DataFrame rounded to 2 decimal places
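The "safe for zeros" note on `log1p` can be checked directly. The sketch below applies the three NumPy functions to a made-up array `x` that contains a zero (the array is an assumption for illustration, not problem data).

```python
import numpy as np

# Hypothetical input that includes a zero.
x = np.array([0.0, 1.0, 10.0, 100.0])

# log1p computes ln(1 + x), so a zero input maps to exactly 0.0.
log1p_values = np.log1p(x)

# log10 and log2 are undefined at zero: NumPy returns -inf and emits a
# RuntimeWarning, which is silenced here only to keep the demo output clean.
with np.errstate(divide="ignore"):
    log10_values = np.log10(x)
    log2_values = np.log2(x)

print(log1p_values[0])   # 0.0
print(log10_values[0])   # -inf
print(log2_values[0])    # -inf
```

If the data can contain zeros, `'log10'` and `'log2'` will produce `-inf` entries, which is why `'log1p'` is the default method.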
---
### 📥 Input
- `data`: A dictionary where keys are column names and values are lists of numeric data
- `method`: One of `'log1p'`, `'log10'`, or `'log2'`
### 📤 Output
- A pandas DataFrame with log-transformed values, rounded to 2 decimals
---
### 🧩 Starter Code
```python
import pandas as pd
import numpy as np


def log_transform(data, method='log1p'):
    """
    Apply logarithmic transformation to reduce skewness.

    Args:
        data (dict): Input data as dictionary (from JSON)
        method (str): 'log1p', 'log10', or 'log2'

    Returns:
        pd.DataFrame: DataFrame with log-transformed values rounded to 2 decimals
    """
    # TODO: Convert the input dictionary to a DataFrame
    # TODO: Apply the appropriate log function based on method
    # TODO: Round the result to 2 decimal places and return
    pass
```
---
### 💡 Examples
**Example 1:** log1p (safe for zeros)
```python
data = {"A": [0.0, 1.0, 2.0, 3.0, 4.0]}
log_transform(data, method='log1p')
```
```
      A
0  0.00
1  0.69
2  1.10
3  1.39
4  1.61
```
**Example 2:** log10 (powers of 10)
```python
data = {"A": [1.0, 10.0, 100.0, 1000.0]}
log_transform(data, method='log10')
```
```
     A
0  0.0
1  1.0
2  2.0
3  3.0
```
**Example 3:** log2 (powers of 2)
```python
data = {"A": [1.0, 2.0, 4.0, 8.0, 16.0]}
log_transform(data, method='log2')
```
```
     A
0  0.0
1  1.0
2  2.0
3  3.0
4  4.0
```
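
For comparison after attempting the problem, here is one possible way to fill in the starter code. It is a sketch, not an official solution. Two behaviors are assumptions the problem statement does not specify: an unknown `method` raises `ValueError`, and non-numeric columns are left unchanged. The `LOG_FUNCS` lookup table is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Lookup table from method name to the NumPy function named in the task.
LOG_FUNCS = {"log1p": np.log1p, "log10": np.log10, "log2": np.log2}


def log_transform(data, method="log1p"):
    if method not in LOG_FUNCS:
        # Assumption: the problem does not say how to handle unknown methods.
        raise ValueError(f"Unsupported method: {method!r}")
    func = LOG_FUNCS[method]

    df = pd.DataFrame(data)
    # Assumption: only numeric columns are transformed; others pass through.
    for col in df.select_dtypes(include="number").columns:
        df[col] = func(df[col])

    # round() only affects numeric columns.
    return df.round(2)


print(log_transform({"A": [0.0, 1.0, 2.0, 3.0, 4.0]}, method="log1p"))
```

A dictionary lookup keeps the dispatch in one place, so supporting another base would only require adding an entry to the table.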