Currency Symbol Removal and Conversion Practice Problem
This data science coding problem gives you practice with string standardization: removing currency symbols and thousands separators from text and converting what remains to numbers.
- Problem ID: 179
- Problem key: 179-currency-symbol-removal-and-conversion
- URL: https://datacrack.app/solve/179-currency-symbol-removal-and-conversion
- Difficulty: medium
- Topic: String Standardization
- Module: Data Cleaning
Problem Statement
# Currency Symbol Removal and Conversion
### 🎯 Goal
Monetary values often arrive as formatted strings — `"$1,234.56"`, `"€99.00"`, `"£2,000"` — which cannot be summed or averaged until they become real numbers. Using a **symbol-to-code mapping you're given**, this function strips the currency symbol and thousands separators, converts the amount to a `float`, and records which currency each value was in.
### 📦 The mapping you're given
The `symbol_map` is passed to the function. For these examples and tests it is:
```python
symbol_map = {"$": "USD", "€": "EUR", "£": "GBP", "¥": "JPY"}
```
### 💻 Task
Implement `clean_currency(data, column, symbol_map)` that:
1. Converts the input dictionary to a DataFrame
2. For each value in `column`, detects the currency by checking which symbol from `symbol_map` appears in the string (empty string `""` if none is found)
3. For each value, removes everything except digits and the decimal point, then converts the result to `float`
4. Replaces `column` with the numeric amounts and adds a new `"currency"` column holding the detected codes
5. Returns the cleaned DataFrame as a dictionary
**Important:** The `symbol_map` is provided as an argument — nothing is hardcoded. Thousands separators (`,`) are removed; the decimal point is kept. A value with no recognized symbol still parses to a number, with `currency = ""`.
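
Steps 2 and 3 can be sketched for a single value as below. This is a minimal illustration, not the reference solution: the helper name `parse_amount` is hypothetical, and it assumes at most one symbol from `symbol_map` appears in each string (if several did, the first one in the map's iteration order would win).

```python
import re


def parse_amount(value, symbol_map):
    """Return (amount, code) for one currency string (hypothetical helper)."""
    text = str(value)
    # Step 2: first symbol from symbol_map found in the string, "" if none.
    code = next((c for sym, c in symbol_map.items() if sym in text), "")
    # Step 3: keep only digits and the decimal point, then convert.
    digits = re.sub(r"[^0-9.]", "", text)
    return float(digits), code


symbol_map = {"$": "USD", "€": "EUR", "£": "GBP", "¥": "JPY"}
print(parse_amount("$1,234.56", symbol_map))  # (1234.56, 'USD')
print(parse_amount("45.50", symbol_map))      # (45.5, '')
```

Two consequences of the "digits and decimal point only" rule are worth keeping in mind: a minus sign is stripped along with everything else, so `"-$5.00"` would parse as `5.0`, and a string containing no digits leaves an empty string, on which `float` raises `ValueError`.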
---
### 📥 Input
- `data`: A dictionary where keys are column names and values are lists
- `column`: The column holding the currency strings
- `symbol_map`: A dictionary mapping each currency **symbol** to its **code** (the `symbol_map` above)
### 📤 Output
- A dictionary representing the cleaned DataFrame, mapping each column name to a list of values: `column` holds the numeric amounts, and a new `"currency"` column holds the detected codes
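
The examples in this problem show the output as column name to list. Assuming that is the expected shape, note that pandas' `DataFrame.to_dict()` returns a nested, index-keyed dictionary by default, while `orient="list"` produces the list form. A small sketch of the difference, using made-up values:

```python
import pandas as pd

df = pd.DataFrame({"price": [1234.56, 99.0], "currency": ["USD", "EUR"]})

# Default orient: each column maps to a {row_index: value} dictionary.
nested = df.to_dict()

# orient="list": each column maps to a plain list, as in the examples.
as_lists = df.to_dict(orient="list")

print(nested)    # {'price': {0: 1234.56, 1: 99.0}, 'currency': {0: 'USD', 1: 'EUR'}}
print(as_lists)  # {'price': [1234.56, 99.0], 'currency': ['USD', 'EUR']}
```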
---
### 🧩 Starter Code
```python
import re
import pandas as pd


def clean_currency(data, column, symbol_map):
    """
    Strip currency symbols and thousands separators, convert to float, and record
    the detected currency code (looked up in symbol_map) in a new column.

    Args:
        data (dict): Input data as dictionary
        column (str): Column holding the currency strings
        symbol_map (dict): Maps each currency symbol to its code, e.g. {"$": "USD"}

    Returns:
        dict: DataFrame as dictionary: numeric amount + new "currency" column
    """
    # TODO: For each value, detect which symbol from symbol_map is present
    # TODO: Strip everything except digits and '.', then convert to float
    # TODO: Replace the column with amounts and add a "currency" column
    # TODO: Return the DataFrame as a dictionary
    pass
```
---
### 💡 Examples
*(all use the `symbol_map` shown above)*
**Example 1:** Three currencies with separators
```python
data = {"price": ["$1,234.56", "€99.00", "£2,000"]}
clean_currency(data, "price", symbol_map)
```
```
{"price": [1234.56, 99.0, 2000.0],
 "currency": ["USD", "EUR", "GBP"]}
```
**Example 2:** A value with no symbol gets `currency = ""`
```python
data = {"amount": ["¥5000", "$0.99", "45.50"]}
clean_currency(data, "amount", symbol_map)
```
```
{"amount": [5000.0, 0.99, 45.5],
 "currency": ["JPY", "USD", ""]}
```
**Example 3:** Large amounts
```python
data = {"cost": ["$1,000,000", "£15.75"]}
clean_currency(data, "cost", symbol_map)
```
```
{"cost": [1000000.0, 15.75],
 "currency": ["USD", "GBP"]}
```
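
To check a solution against the three examples, a small harness along these lines may help. It is only a sketch: `EXAMPLES`, `SYMBOL_MAP`, and `check_examples` are hypothetical names, and it assumes the function under test returns a dictionary of lists containing both the amount column and `"currency"`. Amounts are compared with `math.isclose` rather than `==` so that tiny floating-point differences do not cause false failures.

```python
import math

SYMBOL_MAP = {"$": "USD", "€": "EUR", "£": "GBP", "¥": "JPY"}

# (data, column, expected) triples copied from the three examples.
EXAMPLES = [
    ({"price": ["$1,234.56", "€99.00", "£2,000"]}, "price",
     {"price": [1234.56, 99.0, 2000.0], "currency": ["USD", "EUR", "GBP"]}),
    ({"amount": ["¥5000", "$0.99", "45.50"]}, "amount",
     {"amount": [5000.0, 0.99, 45.5], "currency": ["JPY", "USD", ""]}),
    ({"cost": ["$1,000,000", "£15.75"]}, "cost",
     {"cost": [1000000.0, 15.75], "currency": ["USD", "GBP"]}),
]


def check_examples(fn, symbol_map=SYMBOL_MAP):
    """Run fn on every example; return failure messages (empty list = all pass)."""
    failures = []
    for data, column, expected in EXAMPLES:
        result = fn(data, column, symbol_map)
        # Amounts must match in count and be numerically close.
        amounts_ok = len(result[column]) == len(expected[column]) and all(
            math.isclose(got, want)
            for got, want in zip(result[column], expected[column])
        )
        # Currency codes must match exactly.
        codes_ok = list(result["currency"]) == expected["currency"]
        if not (amounts_ok and codes_ok):
            failures.append(f"{column}: expected {expected}, got {result}")
    return failures
```

Calling `check_examples(clean_currency)` with a finished implementation should return an empty list.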