Check Data Ranges Practice Problem
This data science coding problem gives you practice with data consistency and validation, specifically checking whether the values in a numeric column fall within an expected range. Read the problem statement, then write your solution.
- Problem ID: 176
- Problem key: 176-check-data-ranges
- URL: https://datacrack.app/solve/176-check-data-ranges
- Difficulty: easy
- Topic: Data Consistency & Validation
- Module: Data Cleaning
Problem Statement
# Check Data Ranges
### 🎯 Goal
Real-world numeric columns frequently contain values that are simply not believable: an `age` of `130`, a `price` of `-3.5`, or a body `temperature` of `110`. Before any analysis, it is essential to flag whether each value sits inside a *reasonable* range.
This function checks every value in a column against an optional lower and upper bound and records the result in a new boolean column.
### 💻 Task
Implement `check_data_ranges(data, column, min_value=None, max_value=None)` that:
1. Converts the input dictionary to a DataFrame
2. Marks a value as **valid** when it is `>= min_value` (if given) **and** `<= max_value` (if given) — bounds are **inclusive**
3. Adds a new boolean column named `"<column>_valid"` holding the result
4. Returns the DataFrame as a dictionary that maps each column name to a list of values (the same shape as the input)
**Important:** If a bound is `None`, that side is not checked. With both bounds `None`, every value is valid.
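The bound logic described above can be sketched on a single pandas Series. This is only an illustration of how optional, inclusive bounds combine into one boolean mask, not the full solution; the helper name `build_range_mask` is hypothetical and is not part of the problem's API.

```python
import pandas as pd


def build_range_mask(values, min_value=None, max_value=None):
    """Hypothetical helper: True where a value lies within the inclusive bounds."""
    series = pd.Series(values)
    # Start by assuming every row is valid.
    mask = pd.Series(True, index=series.index)
    # Apply each bound only if it was provided; a None bound is skipped.
    if min_value is not None:
        mask &= series >= min_value
    if max_value is not None:
        mask &= series <= max_value
    return mask


ages = [25, -5, 130, 45, 0]
print(build_range_mask(ages, min_value=0, max_value=120).tolist())
print(build_range_mask(ages).tolist())  # no bounds: every value is valid
```

Starting from an all-`True` mask and narrowing it with `&=` means the "both bounds `None`" case needs no special handling.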
---
### 📥 Input
- `data`: A dictionary where keys are column names and values are lists
- `column`: The numeric column to validate
- `min_value` *(optional)*: Inclusive lower bound
- `max_value` *(optional)*: Inclusive upper bound
### 📤 Output
- A dictionary representing the DataFrame with an added `"<column>_valid"` boolean column
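The examples in this problem show each column as a plain list, which suggests (as an assumption, since the statement does not name an orientation) that the conversion should use `to_dict(orient="list")` rather than the default. A small sketch of the difference, using a made-up two-row frame:

```python
import pandas as pd

# Illustrative frame only; the values are not taken from the hidden test cases.
df = pd.DataFrame({"age": [25, -5], "age_valid": [True, False]})

# orient="list" maps each column name to a list of its values.
as_lists = df.to_dict(orient="list")
print(as_lists)  # {'age': [25, -5], 'age_valid': [True, False]}

# The default orientation nests each column as an index-to-value mapping instead.
as_default = df.to_dict()
print(as_default["age"])
```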
---
### 🧩 Starter Code
```python
import pandas as pd


def check_data_ranges(data, column, min_value=None, max_value=None):
    """
    Flag whether each value in a column falls within reasonable bounds.

    Args:
        data (dict): Input data as dictionary
        column (str): Column to validate
        min_value (numeric, optional): Lower bound (inclusive)
        max_value (numeric, optional): Upper bound (inclusive)

    Returns:
        dict: DataFrame as dictionary with an added "<column>_valid" boolean column
    """
    # TODO: Convert input dictionary to DataFrame
    # TODO: Start with all rows assumed valid
    # TODO: If min_value is provided, require value >= min_value
    # TODO: If max_value is provided, require value <= max_value
    # TODO: Store the result in a "<column>_valid" column
    # TODO: Return DataFrame as dictionary
    pass
```
---
### 💡 Examples
**Example 1:** Ages must be between 0 and 120
```python
data = {"age": [25, -5, 130, 45, 0]}
check_data_ranges(data, "age", min_value=0, max_value=120)
```
```
{"age": [25, -5, 130, 45, 0],
"age_valid": [True, False, False, True, True]}
```
**Example 2:** Prices only need a lower bound
```python
data = {"price": [19.99, -3.5, 0.0, 100.0]}
check_data_ranges(data, "price", min_value=0)
```
```
{"price": [19.99, -3.5, 0.0, 100.0],
"price_valid": [True, False, True, True]}
```
**Example 3:** Body temperature within a tight window
```python
data = {"temp": [98.6, 105.2, 95.0, 110.0]}
check_data_ranges(data, "temp", min_value=95, max_value=106)
```
```
{"temp": [98.6, 105.2, 95.0, 110.0],
"temp_valid": [True, True, True, False]}
```
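To check a solution against the three examples above, a small harness along these lines may help. It is a sketch, not the site's grader: the names `EXAMPLES` and `failing_examples` are hypothetical, and the `always_valid` stand-in exists only so the snippet runs on its own.

```python
# Each entry: (data, column, bounds as keyword arguments, expected "<column>_valid" list).
EXAMPLES = [
    ({"age": [25, -5, 130, 45, 0]}, "age",
     {"min_value": 0, "max_value": 120}, [True, False, False, True, True]),
    ({"price": [19.99, -3.5, 0.0, 100.0]}, "price",
     {"min_value": 0}, [True, False, True, True]),
    ({"temp": [98.6, 105.2, 95.0, 110.0]}, "temp",
     {"min_value": 95, "max_value": 106}, [True, True, True, False]),
]


def failing_examples(solution):
    """Return the 1-based numbers of the examples that `solution` gets wrong."""
    failures = []
    for number, (data, column, bounds, expected) in enumerate(EXAMPLES, start=1):
        result = solution(dict(data), column, **bounds)
        if list(result.get(f"{column}_valid", [])) != expected:
            failures.append(number)
    return failures


def always_valid(data, column, min_value=None, max_value=None):
    """Deliberately wrong stand-in that ignores the bounds and marks everything valid."""
    return {**data, f"{column}_valid": [True] * len(data[column])}


print(failing_examples(always_valid))  # [1, 2, 3]
```

Once `check_data_ranges` is implemented, calling `failing_examples(check_data_ranges)` should return an empty list if the solution matches all three examples.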