Regularized Log Loss Function Practice Problem
This data science coding problem covers regularization for logistic regression: you implement a regularized log loss function that returns the base loss, the penalty, and their sum. Read the problem statement, then write your solution.
- Problem ID: 135
- Problem key: 135-regularized-log-loss-function
- URL: https://datacrack.app/solve/135-regularized-log-loss-function
- Difficulty: medium
- Topic: Regularization for Logistic Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Regularized Log Loss Function
---
### 🎯 Goal
Understand how **regularization** modifies the standard **log loss** (binary cross-entropy) by adding a **penalty term** that discourages large weights, helping prevent **overfitting** in logistic regression.
---
### 💻 Task
You are given input features $X$, binary targets $y$, a weight vector $w$ (`weights` in code), a bias $b$, a regularization type (`l1` or `l2`), and a regularization strength $\lambda$.
Steps:
1. Compute the predicted probabilities with the sigmoid function: $p = \sigma(Xw + b)$
2. Clip predictions to avoid log(0): `np.clip(p, 1e-15, 1 - 1e-15)`
3. Compute the **base log loss**: $L_{\text{base}} = -\frac{1}{N}\sum_{i=1}^{N}[y_i \log(p_i) + (1 - y_i) \log(1 - p_i)]$
4. Compute the **penalty**:
- L2: $P = \frac{\lambda}{2N}\sum_j w_j^2$
- L1: $P = \frac{\lambda}{N}\sum_j |w_j|$
5. Compute the **total loss**: $L_{\text{total}} = L_{\text{base}} + P$
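6. Return the tuple `(base_loss, penalty, total_loss)`, each rounded to 6 decimal places. Round only at the end: in the example below, `total_loss` matches the rounded sum of the unrounded terms, not the sum of the two rounded values.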
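
Step 2 is easy to overlook, so here is a minimal sketch of what it guards against. The logit value `40.0` and the helper name `sigmoid` are illustrative choices, not part of the problem. For a large enough logit, the sigmoid rounds to exactly `1.0` in float64, and the $\log(1 - p)$ term then becomes $-\infty$ unless $p$ is clipped first.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# A large positive logit saturates: in float64, 1 + exp(-40) rounds to exactly 1.0.
p_raw = sigmoid(np.array([40.0]))

# Without clipping, a sample with y = 0 contributes log(1 - p) = log(0) = -inf.
with np.errstate(divide="ignore"):
    unclipped_term = np.log(1.0 - p_raw)

# Clipping keeps p strictly inside (0, 1), so both logarithms stay finite.
p_clipped = np.clip(p_raw, 1e-15, 1 - 1e-15)
clipped_term = np.log(1.0 - p_clipped)

print(p_raw[0] == 1.0)    # True
print(unclipped_term[0])  # -inf
print(clipped_term[0])    # finite, roughly -34.5
```

The clipped value caps the per-sample loss at about 34.5 instead of letting a single saturated prediction turn the mean into infinity.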
---
### 🔍 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
| :----: | :------ | :----------- |
| $X$ | Input feature matrix | $(N, d)$ |
| $y$ | Binary target values (0 or 1) | $(N,)$ |
| $w$ | Weight vector | $(d,)$ |
| $b$ | Bias (scalar) | float |
| $\lambda$ | Regularization strength | float |
| $N$ | Number of samples | integer |
| $d$ | Number of features | integer |
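
The shapes in the table can be checked with a short sketch. The bias value `0.5` is a hypothetical choice for illustration. The sketch also assumes inputs may arrive as plain Python lists (as in the example further down), so it converts them with `np.asarray` first.

```python
import numpy as np

# Illustrative sizes: N = 3 samples, d = 2 features.
X = np.asarray([[1, 2], [3, 4], [5, 6]], dtype=float)  # shape (N, d)
w = np.asarray([0.1, -0.2], dtype=float)               # shape (d,)
b = 0.5                                                # scalar bias (hypothetical value)

# X @ w contracts the feature axis; the scalar b is broadcast to every sample.
z = X @ w + b

print(X.shape, w.shape, z.shape)  # (3, 2) (2,) (3,)
```

The result `z` has one logit per sample, which is the shape the sigmoid and the log loss expect.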
---
### 📖 Background
In logistic regression, we use the **sigmoid function** to produce probabilities:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
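As a quick illustration, the formula can be evaluated directly with NumPy. This is a minimal sketch; the helper name `sigmoid` and the sample logits are chosen here for demonstration only.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
p = sigmoid(z)

# Values are approximately 0.119203, 0.5, and 0.880797.
print(np.round(p, 6))

# The function is symmetric around 0.5: sigmoid(-z) equals 1 - sigmoid(z).
print(np.allclose(sigmoid(-z), 1.0 - p))  # True
```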
The standard loss is **binary cross-entropy (log loss)**:
$$L_{\text{base}} = -\frac{1}{N}\sum_{i=1}^{N}[y_i \log(p_i) + (1 - y_i) \log(1 - p_i)]$$
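To get a feel for the formula, the sketch below evaluates it on two hand-picked probability vectors. The helper name `base_log_loss` and the probabilities are illustrative, and the inputs are assumed to lie strictly inside $(0, 1)$, so no clipping is shown here.

```python
import numpy as np

def base_log_loss(y, p):
    """Mean binary cross-entropy for labels y and probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

y = [1, 0]
confident_right = base_log_loss(y, [0.9, 0.1])  # both predictions lean the right way
confident_wrong = base_log_loss(y, [0.1, 0.9])  # both predictions lean the wrong way

print(round(confident_right, 6))  # 0.105361
print(round(confident_wrong, 6))  # 2.302585
```

Confidently wrong predictions are penalized far more heavily than confidently right ones are rewarded, which is the behavior the logarithm is meant to produce.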
However, models with many features can **overfit**. **Regularization** adds a penalty to discourage large weights:
- **L2 (Ridge):** $P = \frac{\lambda}{2N}\|w\|_2^2$ → shrinks weights smoothly toward zero
- **L1 (Lasso):** $P = \frac{\lambda}{N}\|w\|_1$ → can push weights exactly to zero (feature selection)
> Note: The bias $b$ is **not** regularized.
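
The two penalty formulas can be compared side by side with a small sketch. The helper name `penalty` and its `n_samples` argument are hypothetical, and raising `ValueError` for any other `reg_type` is an assumption, since the problem does not say how to handle that case. The bias is deliberately not a parameter, because it is not regularized.

```python
import numpy as np

def penalty(weights, reg_type, lambda_param, n_samples):
    """Penalty term as defined in this problem; the bias is not an argument."""
    w = np.asarray(weights, dtype=float)
    if reg_type == "l2":
        return float(lambda_param / (2 * n_samples) * np.sum(w ** 2))
    if reg_type == "l1":
        return float(lambda_param / n_samples * np.sum(np.abs(w)))
    # Assumption: the problem does not specify behavior for other values.
    raise ValueError(f"unsupported reg_type: {reg_type!r}")

w = [0.1, -0.2]
print(round(penalty(w, "l2", 0.1, 3), 6))  # 0.000833
print(round(penalty(w, "l1", 0.1, 3), 6))  # 0.01
```

For weights smaller than 1 in magnitude, squaring makes the L2 penalty much smaller than the L1 penalty at the same $\lambda$, as the two printed values show.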
---
### 📥 Input / 📤 Output
* **Input:** `X`, `y`, `weights`, `bias`, `reg_type` (`'l1'` or `'l2'`), `lambda_param`
* **Output:** Tuple `(base_loss, penalty, total_loss)` — each a float rounded to 6 decimals
---
### 🧩 Starter Code
```python
import numpy as np


def compute_regularized_log_loss(X, y, weights, bias, reg_type, lambda_param):
    """
    Compute the regularized log loss = base log loss + penalty.

    Args:
        X: input features, shape (N, d)
        y: binary target values, shape (N,)
        weights: weight vector, shape (d,)
        bias: bias term (scalar)
        reg_type: 'l1' or 'l2'
        lambda_param: regularization strength

    Returns:
        tuple: (base_loss, penalty, total_loss)
    """
    # TODO: Compute base log loss, penalty, and total loss
    pass
```
---
### 💡 Example
```python
X = [[1, 2], [3, 4], [5, 6]]
y = [1, 0, 1]
weights = [0.1, -0.2]
result = compute_regularized_log_loss(X, y, weights, 0.0, 'l2', 0.1)
print(result)
```
**Expected Output:**
```
(0.810539, 0.000833, 0.811373)
```
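The expected output can be reproduced by hand from the formulas in the task. The intermediate values below are rounded to 6 decimals for display only.

$$z = Xw + b = (-0.3,\ -0.5,\ -0.7), \qquad p = \sigma(z) \approx (0.425557,\ 0.377541,\ 0.331812)$$
$$L_{\text{base}} = -\frac{1}{3}\left[\log(0.425557) + \log(1 - 0.377541) + \log(0.331812)\right] \approx \frac{0.854355 + 0.474077 + 1.103186}{3} \approx 0.810539$$
$$P = \frac{0.1}{2 \cdot 3}\left(0.1^2 + (-0.2)^2\right) = \frac{0.1 \cdot 0.05}{6} \approx 0.000833$$
$$L_{\text{total}} = L_{\text{base}} + P \approx 0.8105394 + 0.0008333 \approx 0.811373$$

Adding the two already-rounded values would give 0.811372 instead, which is why the total is computed from the unrounded terms and rounded last.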
---