Forward Fill and Backward Fill Practice Problem
This data science coding problem gives you practice with missing data handling, specifically forward fill and backward fill. Read the problem statement, then write your solution.
- Problem ID: 25
- Problem key: 25-forward-fill-and-backward-fill
- URL: https://datacrack.app/solve/25-forward-fill-and-backward-fill
- Difficulty: easy
- Topic: Missing Data Handling
- Module: Data Cleaning
Problem Statement
# 🧩 Forward Fill and Backward Fill for Time-Series Data
---
### 🎯 Goal
In time-series or other ordered data, a missing value can often be assumed to equal the previous or the next observed value (**continuity**).
Two common propagation methods are:
- **Forward Fill (ffill)**: Propagate the last valid observation forward
- **Backward Fill (bfill)**: Propagate the next valid observation backward
---
### 🔍 When to Use Fill Methods?
| Method | Use Case | Example |
|:------:|:---------|:--------|
| **Forward Fill** | Value persists until it changes (e.g., sensor readings, stock prices) | Temperature sensor fails → assume same temp |
| **Backward Fill** | Future value is known and should apply retroactively | Filling gaps in weather forecast data |
**Visual Example (Forward Fill):**
```
Before: [1, NaN, NaN, 4, NaN, 6]
After:  [1,   1,   1, 4,   4, 6]
              ↓    ↓       ↓
```
**Visual Example (Backward Fill):**
```
Before: [NaN, 2, NaN, 4, NaN, NaN]
After:  [  2, 2,   4, 4, NaN, NaN]
           ↑       ↑
```
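The two visual examples can be reproduced with a short sketch. It uses `Series.ffill()` and `Series.bfill()`; the variable names are illustrative only and do not come from the problem.

```python
import numpy as np
import pandas as pd

# Same values as the two visual examples above.
forward_input = pd.Series([1, np.nan, np.nan, 4, np.nan, 6])
backward_input = pd.Series([np.nan, 2, np.nan, 4, np.nan, np.nan])

# ffill copies the last valid value downward; bfill copies the next one upward.
forward_filled = forward_input.ffill()
backward_filled = backward_input.bfill()

print(forward_filled.tolist())   # [1.0, 1.0, 1.0, 4.0, 4.0, 6.0]
print(backward_filled.tolist())  # [2.0, 2.0, 4.0, 4.0, nan, nan]
```

The integers become floats because a column that holds `NaN` is stored as `float64`.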
---
### 📥 Input
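- `df`: A pandas DataFrame with missing values. The starter code names this parameter `data` and also accepts a dictionary of columns, which it converts with `pd.DataFrame(data)`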
- `method`: String indicating fill method (`'ffill'` for forward fill, `'bfill'` for backward fill)
### 📤 Output
- A pandas DataFrame with missing values filled using the specified method
---
### 💻 Task
Implement a Python function `fill_forward_backward(df, method)` that:
1. Checks the fill method
2. Propagates values forward or backward to fill gaps
3. Returns the filled DataFrame
---
### 🧩 Starter Code
```python
import pandas as pd
import numpy as np


def fill_forward_backward(data, method='ffill'):
    """
    Fill missing values using forward fill or backward fill.

    Args:
        data (dict or pd.DataFrame): Input data, e.g. a dictionary parsed from JSON
        method (str): 'ffill' (forward fill) or 'bfill' (backward fill)

    Returns:
        pd.DataFrame: DataFrame with missing values filled
    """
    # 🧠 TODO: Convert the input to a DataFrame using pd.DataFrame(data)
    # 🧠 TODO: Use df.ffill() / df.bfill()
    #          (df.fillna(method=method) is deprecated since pandas 2.1)
    # 🧠 TODO: Return the filled DataFrame, leaving the input unchanged
    pass
```
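One way the TODOs could be completed is sketched below. This is not the official solution: the name `fill_sketch` is hypothetical, and raising `ValueError` for an unknown `method` is an assumption, since the problem does not say how invalid input should be handled.

```python
import numpy as np
import pandas as pd


def fill_sketch(data, method="ffill"):
    """Hypothetical reference sketch of the task, not the official solution."""
    # pd.DataFrame(...) accepts a dict of columns or an existing DataFrame.
    df = pd.DataFrame(data)

    # ffill() and bfill() return a new DataFrame, so the caller's
    # input is left unchanged.
    if method == "ffill":
        return df.ffill()
    if method == "bfill":
        return df.bfill()

    # Assumption: unknown methods are rejected rather than silently ignored.
    raise ValueError(f"method must be 'ffill' or 'bfill', got {method!r}")


result = fill_sketch({"value": [1.0, np.nan, np.nan, 4.0, np.nan, 6.0]})
print(result["value"].tolist())  # [1.0, 1.0, 1.0, 4.0, 4.0, 6.0]
```

Dispatching to the dedicated methods avoids `fillna(method=...)`, which newer pandas versions warn about.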
---
### 💡 Example 1: Forward Fill
```python
df = pd.DataFrame({
'value': [1.0, np.nan, np.nan, 4.0, np.nan, 6.0]
})
fill_forward_backward(df, method='ffill')
```
#### Expected Output
```python
   value
0    1.0
1    1.0  # Filled from row 0
2    1.0  # Filled from row 0
3    4.0
4    4.0  # Filled from row 3
5    6.0
```
---
### 💡 Example 2: Backward Fill
```python
df = pd.DataFrame({
'value': [np.nan, 2.0, np.nan, 4.0, np.nan, np.nan]
})
fill_forward_backward(df, method='bfill')
```
#### Expected Output
```python
   value
0    2.0  # Filled from row 1
1    2.0
2    4.0  # Filled from row 3
3    4.0
4    NaN  # No future value to fill from
5    NaN
```
---
### 💡 Example 3: Multiple Columns
```python
df = pd.DataFrame({
'A': [1.0, np.nan, 3.0, np.nan],
'B': [np.nan, 5.0, np.nan, 7.0]
})
fill_forward_backward(df, method='ffill')
```
#### Expected Output
```python
     A    B
0  1.0  NaN  # B has no previous value
1  1.0  5.0
2  3.0  5.0
3  3.0  7.0
```
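The expected output above reflects that filling runs down the rows of each column independently. A small check, with illustrative variable names, might look like this:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, np.nan],
    "B": [np.nan, 5.0, np.nan, 7.0],
})

filled = df.ffill()

# Filling the whole frame gives the same result as filling each column alone.
per_column = pd.DataFrame({name: df[name].ffill() for name in df.columns})
print(filled.equals(per_column))  # True

# Column A is fully filled; B keeps its leading NaN because nothing precedes it.
print(int(filled["A"].isna().sum()))  # 0
print(int(filled["B"].isna().sum()))  # 1
```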
---
### 🔑 Key Pandas Functions
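- `df.fillna(method='ffill')`: Forward fill missing values (the `method` argument is deprecated since pandas 2.1)
- `df.fillna(method='bfill')`: Backward fill missing values (same deprecation)
- `df.ffill()`: Dedicated method for forward fill; the recommended spelling in current pandas
- `df.bfill()`: Dedicated method for backward fill; the recommended spelling in current pandas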
---
### ⚠️ Important Notes
1. **Order Matters**: These methods assume rows are in meaningful order (e.g., time-sorted)
2. **Leading/Trailing NaNs**:
   - Forward fill can't fill NaNs at the **start** (no previous value)
   - Backward fill can't fill NaNs at the **end** (no future value)
3. **Not for All Data**: Works best for time-series, sequential, or otherwise ordered data; when row order is arbitrary, a neighboring row's value carries no meaning
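The first two notes can be illustrated with a short sketch. The `readings` table and its `time` and `temp` columns are hypothetical and only serve to show the effect of row order and of leading or trailing gaps.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor log whose rows are not in time order.
readings = pd.DataFrame({
    "time": [3, 1, 2, 4],
    "temp": [np.nan, 20.0, np.nan, 25.0],
})

# Filling in arrival order: row 0 has nothing before it and stays NaN.
unsorted_fill = readings["temp"].ffill().tolist()
print(unsorted_fill)  # [nan, 20.0, 20.0, 25.0]

# Sorting by time first lets the reading at time 1 cover times 2 and 3.
sorted_readings = readings.sort_values("time").reset_index(drop=True)
sorted_fill = sorted_readings["temp"].ffill().tolist()
print(sorted_fill)  # [20.0, 20.0, 20.0, 25.0]

# Leading and trailing gaps: each method leaves one edge unfilled.
gaps = pd.Series([np.nan, 2.0, np.nan])
print(gaps.ffill().tolist())  # [nan, 2.0, 2.0]
print(gaps.bfill().tolist())  # [2.0, 2.0, nan]
```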
---