Cosine Similarity Practice Problem
This data science coding problem gives you practice with linear algebra in NumPy by implementing cosine similarity. Read the problem statement, then write your solution.
- Problem ID: 64
- Problem key: 64-cosine-similarity
- URL: https://datacrack.app/solve/64-cosine-similarity
- Difficulty: medium
- Topic: Linear Algebra with NumPy
- Module: NumPy Foundations
Problem Statement
# 🧩 Cosine Similarity
---
### 🎯 Goal
Cosine similarity is the cosine of the **angle** between two vectors; it ignores their magnitudes. It is one of the most widely used similarity metrics in NLP (comparing word embeddings, document similarity), recommendation systems, and information retrieval. Vectors pointing in the same direction have similarity 1, perpendicular vectors have 0, and vectors pointing in opposite directions have -1.
---
### 🔍 The Formula
$$
\text{cosine\_similarity}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \cdot \|\mathbf{b}\|}
$$
| Symbol | Meaning |
|:-------|:--------|
| $\mathbf{a} \cdot \mathbf{b}$ | Dot product |
| $\|\mathbf{a}\|$ | L2 norm (magnitude): $\sqrt{a_1^2 + a_2^2 + \cdots}$ |
**Output range:** −1 (opposite) to +1 (identical direction)
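As a worked illustration of the formula, take the vectors $\mathbf{a} = (1, 2, 3)$ and $\mathbf{b} = (4, 5, 6)$. These values are chosen here for illustration only and are not part of the problem's test cases.

$$
\mathbf{a} \cdot \mathbf{b} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32, \qquad \|\mathbf{a}\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}, \qquad \|\mathbf{b}\| = \sqrt{4^2 + 5^2 + 6^2} = \sqrt{77}
$$
$$
\text{cosine\_similarity}(\mathbf{a}, \mathbf{b}) = \frac{32}{\sqrt{14} \cdot \sqrt{77}} = \frac{32}{\sqrt{1078}} \approx 0.9746
$$

The result is close to +1 because the two vectors point in nearly the same direction, even though their lengths differ.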
---
### 💻 Task
Implement `cosine_similarity(a, b)` using `np.dot` and `np.linalg.norm`.
---
### 📥 Input
- `a`: list of numbers (vector)
- `b`: list of numbers (vector, same length as `a`)
### 📤 Output
- float — cosine similarity in [-1, 1]
---
### 🧩 Starter Code
```python
import numpy as np


def cosine_similarity(a, b):
    """
    Compute the cosine similarity between two vectors.

    Args:
        a (list): First vector
        b (list): Second vector

    Returns:
        float: Cosine similarity in [-1, 1]
    """
    vec_a = np.array(a, dtype=float)
    vec_b = np.array(b, dtype=float)
    # 🧠 TODO: dot = np.dot(vec_a, vec_b)
    # 🧠 TODO: norm_a = np.linalg.norm(vec_a), norm_b = np.linalg.norm(vec_b)
    # 🧠 TODO: return dot / (norm_a * norm_b)
    pass
```
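One way the TODO steps might be filled in is sketched below. It follows the three hints in the starter code directly. The zero-vector check is an assumption added here: the problem statement does not say how zero-length input should be handled, so raising `ValueError` is only one reasonable choice.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (one possible completion)."""
    vec_a = np.array(a, dtype=float)
    vec_b = np.array(b, dtype=float)

    dot = np.dot(vec_a, vec_b)
    norm_a = np.linalg.norm(vec_a)
    norm_b = np.linalg.norm(vec_b)

    # Assumption, not part of the problem statement: the formula is undefined
    # when either vector has zero length, so this sketch raises instead of
    # dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    # Convert the NumPy scalar to a plain Python float.
    return float(dot / (norm_a * norm_b))


print(cosine_similarity([1, 0, 0], [1, 0, 0]))  # same direction
print(cosine_similarity([1, 0], [0, 1]))        # perpendicular
print(cosine_similarity([1, 1], [-1, -1]))      # opposite
```

Because the norms are computed in floating point, results for inputs such as `[1, 1]` and `[-1, -1]` can differ from the exact value in the last decimal places, so comparisons are safer with a tolerance (for example `math.isclose`) than with `==`.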
---
### 💡 Example
```python
cosine_similarity([1, 0, 0], [1, 0, 0]) # Expected: 1.0 (same direction)
cosine_similarity([1, 0], [0, 1]) # Expected: 0.0 (perpendicular)
cosine_similarity([1, 1], [-1, -1]) # Expected: -1.0, up to floating-point rounding (opposite)
```
---
### 🔑 Key Concepts
- `np.linalg.norm(v)` — L2 (Euclidean) norm: $\sqrt{\sum v_i^2}$
- The formula normalizes by magnitude, so direction, not length, determines the score
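Both key concepts can be checked numerically. The sketch below first confirms that `np.linalg.norm` matches the square-root-of-sum-of-squares definition. It then shows that rescaling the inputs leaves the score unchanged, by computing the score as the dot product of unit vectors. The helper `unit` and the sample vectors are hypothetical names and values introduced only for this illustration.

```python
import numpy as np

v = np.array([3.0, 4.0])

# For a 1-D array, np.linalg.norm defaults to the L2 (Euclidean) norm.
assert np.isclose(np.linalg.norm(v), np.sqrt(np.sum(v ** 2)))


def unit(x):
    """Hypothetical helper: rescale a non-zero vector to length 1."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Dividing each vector by its norm first and then taking the dot product
# is algebraically the same as dot / (norm_a * norm_b).
score = np.dot(unit(a), unit(b))

# Stretching a by 10 and shrinking b by half changes the lengths only,
# so the score should be unchanged up to rounding.
scaled_score = np.dot(unit(10 * a), unit(0.5 * b))

print(np.isclose(score, scaled_score))
```

Scaling by a negative factor is a different case: it reverses the vector's direction, which flips the sign of the score.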