Weighted Sampling Practice Problem
This data science coding problem lets you practice weighted sampling with NumPy, part of the Random Sampling & Generators topic. Read the problem statement, write your solution, and check it against the expected output.
- Problem ID: 120
- Problem key: 120-weighted-sampling
- URL: https://datacrack.app/solve/120-weighted-sampling
- Difficulty: easy
- Topic: Random Sampling & Generators
- Module: NumPy Foundations
Problem Statement
# 🧩 Weighted Sampling
---
### 🎯 Goal
Often we don't want every item to have an equal chance of being selected, as with a biased coin or an unbalanced roulette wheel. `np.random.choice` accepts a `p` (probability) array that assigns a selection probability to each item in the data.
---
### 🔍 Weighted Sampling with `p`
```python
import numpy as np

np.random.seed(42)
choices = ['Win', 'Lose']
probs = [0.01, 0.99]  # 1% chance to win
result = np.random.choice(choices, size=5, p=probs)
print(result)  # ['Lose' 'Lose' 'Lose' 'Lose' 'Lose'] (a NumPy array, so no commas)
```
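To see that `p` really controls how often each item appears, one option is to draw many samples and compare the observed frequencies with the weights. The sketch below is illustrative only: the 25%/75% split, the seed `0`, and the sample size are arbitrary assumptions, not part of the problem.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only for reproducibility
choices = ['Win', 'Lose']
probs = [0.25, 0.75]  # hypothetical weights chosen for this illustration

# Draw many samples so the observed frequencies settle near the weights.
draws = np.random.choice(choices, size=10_000, p=probs)

# Count how often each label occurs and convert counts to fractions.
values, counts = np.unique(draws, return_counts=True)
freqs = dict(zip(values.tolist(), (counts / draws.size).tolist()))
print(freqs)  # fractions should land close to 0.25 for 'Win' and 0.75 for 'Lose'
```

With only a handful of draws the fractions can be far from the weights; they get closer as `size` grows.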
---
### 💻 Task
Implement `weighted_sample(data, weights, size, seed)` that samples `size` elements from `data` based on their corresponding `weights`.
---
### 📥 Input
- `data`: list of values to sample from
- `weights`: list of probabilities (floats) matching `data` length, summing to 1.0
- `size`: int — number of samples to draw
- `seed`: int — random seed
### 📤 Output
- A list of the sampled items.
---
### 🧩 Starter Code
```python
import numpy as np

def weighted_sample(data, weights, size, seed):
    """
    Sample elements with custom probabilities.
    """
    # 🧠 TODO: Set the seed
    # 🧠 TODO: Use np.random.choice with the p parameter
    # 🧠 TODO: Return the result as a python list
    pass
```
---
### 💡 Expected Output
```python
weighted_sample(['A', 'B', 'C'], [0.1, 0.8, 0.1], 5, 42)
# Expected: ['B', 'C', 'B', 'B', 'B']
```
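One way to understand why this seed produces mostly `'B'` is to trace the draw by hand with inverse-CDF sampling: the cumulative weights split the interval [0, 1) into one slice per item, and each uniform random number selects the slice it falls into. The code below is a minimal sketch of that technique, not a statement of how NumPy implements `np.random.choice` internally; the names `cdf`, `u`, `idx`, and `manual` are introduced here for illustration.

```python
import numpy as np

data = ['A', 'B', 'C']
weights = [0.1, 0.8, 0.1]

# Cumulative weights give the right edge of each item's slice of [0, 1):
# A -> [0, 0.1), B -> [0.1, 0.9), C -> [0.9, 1.0)
cdf = np.cumsum(weights)
cdf = cdf / cdf[-1]  # guard against rounding so the last edge is exactly 1.0

np.random.seed(42)
u = np.random.random_sample(5)  # five uniform draws in [0, 1)

# For each draw, find the index of the slice that contains it.
idx = np.searchsorted(cdf, u, side='right')
manual = [data[i] for i in idx]
print(manual)
```

Because `'B'` owns 80% of the interval, most uniform draws land in its slice; only a draw of 0.9 or above selects `'C'`.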
---
### 🔑 Key Concepts
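- The `p` argument in `np.random.choice` maps 1-to-1 with the `a` (data) argument: `p[i]` is the probability of drawing `a[i]`, so both must have the same length.
- Probabilities in `p` must be non-negative and sum to 1.0; NumPy raises a `ValueError` when the sum is off by more than a small floating-point tolerance.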
- Weighted sampling is heavily used in Markov chains and simulating real-world biased distributions.
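The sum-to-1 requirement matters in practice because weights often start out as raw counts or scores. A small sketch of one way to handle this, assuming hypothetical raw weights `[1, 8, 1]`, is to divide by the total before passing them as `p`; the same call with the unnormalized weights is rejected.

```python
import numpy as np

items = ['A', 'B', 'C']
raw_weights = np.array([1.0, 8.0, 1.0])  # hypothetical unnormalized weights

# Normalize so the probabilities sum to 1.0.
probs = raw_weights / raw_weights.sum()

np.random.seed(42)
sample = np.random.choice(items, size=5, p=probs).tolist()

# Passing weights that do not sum to 1 raises ValueError.
try:
    np.random.choice(items, size=5, p=raw_weights)
    rejected = False
except ValueError:
    rejected = True

print(sample, rejected)
```

Normalizing inside your own helper is a design choice; the task above states that `weights` already sum to 1.0, so it is not required there.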