Remove Duplicates Preserving Order Practice Problem
This data science coding problem helps you practice data structures and implementation skills by removing duplicates from a list while preserving order. Read the problem statement, write your solution, and strengthen your understanding of data structures.
- Problem ID: 89
- Problem key: 89-remove-duplicates-preserving-order
- URL: https://datacrack.app/solve/89-remove-duplicates-preserving-order
- Difficulty: easy
- Topic: Data Structures
- Module: Python Fundamentals
Problem Statement
# 🧩 Remove Duplicates Preserving Order
---
### 🎯 Goal
Removing duplicates from a dataset while **preserving the original order** is a frequent data cleaning task. This problem teaches the tradeoff between using `set` (fast lookup, no order) vs `list` (ordered, slower lookup).
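
To make the lookup cost concrete, here is a minimal sketch of a list-only approach. The helper name `dedupe_list_only` is hypothetical and is not part of the problem's starter code. It returns the right answer, but every membership check scans the result list, so the total work grows roughly quadratically with the input size.

```python
def dedupe_list_only(items):
    """Hypothetical helper: deduplicate using only a list (no set)."""
    result = []
    for item in items:
        # `item not in result` walks the list element by element,
        # so each check costs O(len(result)) instead of O(1) on average.
        if item not in result:
            result.append(item)
    return result


print(dedupe_list_only([3, 1, 4, 1, 5, 9, 2, 6, 5]))  # [3, 1, 4, 5, 9, 2, 6]
```

For small inputs the difference is negligible. It starts to matter once the list holds many distinct values.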
---
### 🔍 The Challenge
- `set(items)` removes duplicates but **loses the original order**
- We want duplicates removed **and** the first occurrence order maintained
```
Input: [3, 1, 4, 1, 5, 9, 2, 6, 5]
Output: [3, 1, 4, 5, 9, 2, 6]
```
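
The sketch below illustrates the first bullet using the same input. A set drops the duplicates, but Python does not guarantee the order in which a set is iterated, so the result cannot be relied on to match the original sequence.

```python
items = [3, 1, 4, 1, 5, 9, 2, 6, 5]

unique = set(items)

# The duplicates (1 and 5) are gone: 7 distinct values remain.
print(len(unique))  # 7

# The iteration order of a set is an implementation detail and may
# differ from the order in `items`, so it is not printed as "expected".
print(list(unique))

# Sorting gives a predictable order, but it is sorted order,
# not the first-occurrence order the problem asks for.
print(sorted(unique))  # [1, 2, 3, 4, 5, 6, 9]
```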
---
### 💻 Task
Implement `remove_duplicates(items)` that preserves insertion order.
---
### 📥 Input
- `items`: list of elements (strings or numbers)
### 📤 Output
- A new list with duplicates removed, keeping the **first occurrence** of each element, **in original order**
---
### 🧩 Starter Code
```python
def remove_duplicates(items):
    """
    Remove duplicates from a list while preserving the order of first occurrences.

    Args:
        items (list): Input list with possible duplicates

    Returns:
        list: Deduplicated list in original order
    """
    seen = set()
    result = []
    for item in items:
        # 🧠 TODO: If item is not in seen, add it to result and mark it as seen
        # 🧠 TODO: If item is already in seen, skip it
        pass
    return result
```
---
### 💡 Example
```python
remove_duplicates([1, 2, 2, 3, 4, 4, 5])
# Expected: [1, 2, 3, 4, 5]
remove_duplicates(["a", "b", "a", "c", "b"])
# Expected: ["a", "b", "c"]
```
---
### 🔑 Key Concepts
- Use a `set` for O(1) lookup to track what's been seen
- Use a `list` for the ordered result
- Two data structures working together: a very common pattern
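
The set-plus-list pattern described under Key Concepts can be sketched as follows. This is one possible way to fill in the TODOs, not necessarily the reference solution. It assumes every item is hashable, which holds for the strings and numbers named in the input description. The second function, `remove_duplicates_dict`, is a hypothetical name for an alternative that swaps in a different technique: it relies on `dict` preserving insertion order, which the language guarantees from Python 3.7 onward.

```python
def remove_duplicates(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()    # values already emitted; average O(1) membership checks
    result = []     # output, in first-occurrence order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def remove_duplicates_dict(items):
    """Hypothetical alternative: dict keys are unique and keep insertion order."""
    return list(dict.fromkeys(items))


print(remove_duplicates([1, 2, 2, 3, 4, 4, 5]))            # [1, 2, 3, 4, 5]
print(remove_duplicates_dict(["a", "b", "a", "c", "b"]))   # ['a', 'b', 'c']
```

Both versions treat values that compare equal and hash equal, such as `1` and `1.0`, as duplicates, so only the first of them is kept.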