L2-Regularized Logistic Regression Practice Problem
This data science coding problem gives you practice with regularization for logistic regression. Read the problem statement, then implement L2-regularized logistic regression from scratch to strengthen your understanding of the topic.
- Problem ID: 134
- Problem key: 134-l2-regularized-logistic-regression
- URL: https://datacrack.app/solve/134-l2-regularized-logistic-regression
- Difficulty: hard
- Topic: Regularization for Logistic Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 L2-Regularized Logistic Regression
---
### 🎯 Goal
Implement **Logistic Regression with an L2 penalty** from scratch using **gradient descent**. Unlike linear regression, logistic regression has no closed-form solution.
> Note: Implementing L2-regularized logistic regression means building the full training pipeline — from the regularized log loss, to gradients, to updating the weights and bias.
---
### 💻 Task
You are given input features $X$, binary targets $y$, regularization strength $\lambda$, a number of iterations, and a learning rate $\eta$.
Train an L2-regularized logistic regression model using gradient descent.
Steps:
1. Initialize weights $w$ as a zero vector of shape $(d,)$ and bias $b = 0$.
2. For each iteration:
   - Compute predicted probabilities: $p = \sigma(Xw + b)$
   - Compute the prediction error: $p - y$
   - Compute the gradient of the log loss with respect to $w$: $\frac{1}{N}X^\top(p - y)$
   - Add the L2 gradient: $\frac{\lambda}{N}w$
   - Compute the gradient with respect to the bias $b$: $\frac{1}{N}\sum_{i=1}^{N}(p_i - y_i)$
   - Update the weights and bias using gradient descent: subtract $\eta$ times each gradient
3. Compute the final loss (log loss + L2 penalty) from the final weights and bias.
4. Return `(weights, bias, loss)`, each rounded to 6 decimal places.
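
The steps above can be sketched as follows. This is one possible implementation, not an official reference solution: the function name `l2_logistic_regression_sketch` and the clipping constant `1e-15` are assumptions made for illustration, and the sigmoid is written inline in its plain form.

```python
import numpy as np


def l2_logistic_regression_sketch(X, y, lambda_param, iterations, learning_rate):
    """Hypothetical sketch of the training loop described in the steps above."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape

    # Step 1: zero-initialized weights and bias.
    w = np.zeros(n_features)
    b = 0.0

    # Step 2: gradient descent.
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        error = p - y                            # prediction error
        # Log-loss gradient plus the L2 gradient (weights only).
        grad_w = X.T @ error / n_samples + (lambda_param / n_samples) * w
        grad_b = error.mean()                    # bias is not regularized
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    # Step 3: final loss = log loss + L2 penalty.
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    p = np.clip(p, 1e-15, 1 - 1e-15)             # assumed epsilon; avoids log(0)
    log_loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    penalty = lambda_param / (2 * n_samples) * np.sum(w ** 2)
    loss = log_loss + penalty

    # Step 4: round everything to 6 decimal places.
    weights = [round(float(v), 6) for v in w]
    return weights, round(float(b), 6), round(float(loss), 6)


if __name__ == "__main__":
    X = [[1, 2], [3, 4], [5, 6], [7, 8]]
    y = [0, 0, 1, 1]
    print(l2_logistic_regression_sketch(X, y, 0.1, 1000, 0.01))
```

With zero iterations the sketch returns zero weights, zero bias, and a loss of $\log 2 \approx 0.693147$, which is a convenient sanity check before comparing longer runs against the expected output in the Example section.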
---
### 🔍 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
| :----: | :------ | :----------- |
| $X$ | Input feature matrix | $(N, d)$ |
| $y$ | Binary target values | $(N,)$ |
| $\lambda$ | L2 regularization strength | float |
| $\eta$ | Learning rate | float |
| $w$ | Learned weight vector | $(d,)$ |
| $b$ | Learned bias | float |
---
### 📖 Background
Logistic regression predicts probabilities using the sigmoid function:
$$
p = \sigma(Xw + b)
$$
where:
$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$
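
The formula above can overflow in floating point when $z$ is a large negative number, because $e^{-z}$ becomes huge. A minimal sketch of one common workaround is shown here; the helper name `sigmoid` and the branch-on-sign approach are illustrative choices, not something the problem requires.

```python
import numpy as np


def sigmoid(z):
    """Numerically stable sigmoid (illustrative helper, not part of the starter code)."""
    z = np.asarray(z, dtype=float)
    # exp(-|z|) always lies in (0, 1], so it cannot overflow.
    e = np.exp(-np.abs(z))
    # z >= 0: 1 / (1 + e^{-z});  z < 0: e^{z} / (1 + e^{z}). Both equal sigma(z).
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))


print(sigmoid([-1000.0, 0.0, 1000.0]))
```

For the small inputs in this problem the plain formula is usually fine; the stable variant mainly matters for unscaled features or large weights.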
The base loss is binary cross-entropy, also called log loss:
$$
L_{\text{base}} =
-\frac{1}{N}\sum_{i=1}^{N}
\left[
y_i\log(p_i) + (1-y_i)\log(1-p_i)
\right]
$$
L2 regularization adds a squared-weight penalty:
$$
P_{L2} = \frac{\lambda}{2N}\|w\|_2^2
$$
So the full loss is:
$$
L(w,b) = L_{\text{base}} + P_{L2}
$$
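
As a rough illustration, the full loss could be computed from the predicted probabilities like this. The function name `regularized_log_loss` and the default `eps=1e-15` are assumptions for the sketch; the formulas are the ones given above.

```python
import numpy as np


def regularized_log_loss(p, y, w, lambda_param, eps=1e-15):
    """Log loss plus L2 penalty (illustrative helper; eps is an assumed clipping value)."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)  # keep log() finite
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.shape[0]

    base = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # L_base
    penalty = lambda_param / (2 * n) * np.sum(w ** 2)         # P_L2, bias excluded
    return base + penalty


# Uninformative predictions (p = 0.5) with zero weights give log(2).
print(regularized_log_loss([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1], [0.0, 0.0], 0.1))
```

Note that the bias never enters the penalty term: only `w` is passed to it.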
Key ideas:
- Logistic regression uses sigmoid probabilities.
- It uses log loss instead of MSE.
- L2 regularization shrinks weights smoothly.
- The bias $b$ is not regularized.
- There is no closed-form solution, so we use gradient descent.
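
The gradients used by gradient descent follow from the loss above. As a short worked derivation, write $z_i = x_i^\top w + b$ and $p_i = \sigma(z_i)$, and use $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$. The chain rule then gives, for each sample:

$$
\frac{\partial L_{\text{base}}}{\partial z_i}
= -\frac{1}{N}\left[\frac{y_i}{p_i} - \frac{1-y_i}{1-p_i}\right] p_i(1-p_i)
= \frac{1}{N}(p_i - y_i)
$$

Summing over samples and adding the derivative of $P_{L2}$ (which depends only on $w$):

$$
\nabla_w L = \frac{1}{N}X^\top(p - y) + \frac{\lambda}{N}w,
\qquad
\frac{\partial L}{\partial b} = \frac{1}{N}\sum_{i=1}^{N}(p_i - y_i)
$$

Each gradient descent step then applies $w \leftarrow w - \eta\,\nabla_w L$ and $b \leftarrow b - \eta\,\frac{\partial L}{\partial b}$.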
---
### 📥 Input / 📤 Output
* **Input:** `X`, `y`, `lambda_param`, `iterations`, `learning_rate`
* **Output:** Tuple `(weights, bias, loss)`, where `weights` is a list of floats and `bias` and `loss` are floats, all rounded to 6 decimals
---
### 🧩 Starter Code
```python
import numpy as np


def l2_logistic_regression(X, y, lambda_param, iterations, learning_rate):
    """
    Fit L2-Regularized Logistic Regression using gradient descent.

    Args:
        X: input features, shape (N, d)
        y: binary target values, shape (N,)
        lambda_param: L2 regularization strength
        iterations: number of gradient descent steps
        learning_rate: step size

    Returns:
        tuple: (weights, bias, loss)
    """
    # TODO: Implement L2-regularized logistic regression
    pass
```
```
---
### 💡 Example
```python
X = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [0, 0, 1, 1]
weights, bias, loss = l2_logistic_regression(X, y, 0.1, 1000, 0.01)
print("Weights:", weights)
print("Bias:", bias)
print("Loss:", loss)
```
**Expected Output:**
```
Weights: [0.833749, -0.301538]
Bias: -1.297379
Loss: 0.359161
```
---
### 🧭 Hint
Use `np.clip` on sigmoid outputs to avoid `log(0)`. The bias gradient does not include the regularization term.
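
To see why the clipping matters, here is a small demonstration with probabilities that have saturated to exactly 0 and 1. The value `eps = 1e-15` is an assumed choice for illustration; the problem does not specify one.

```python
import numpy as np

p = np.array([0.0, 1.0])   # saturated sigmoid outputs
y = np.array([0.0, 1.0])   # predictions are actually correct here

# Without clipping: 0 * log(0) evaluates to 0 * (-inf) = nan.
with np.errstate(divide="ignore", invalid="ignore"):
    raw = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# With clipping: every log() argument stays strictly inside (0, 1).
eps = 1e-15
p_safe = np.clip(p, eps, 1 - eps)
loss = -np.mean(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))

print("unclipped:", raw)
print("clipped:  ", loss)
```

The unclipped loss comes out as `nan` even though both predictions are correct, while the clipped version stays finite and close to zero.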
---