Breast Cancer Regularized Logistic Regression Practice Problem
This data science coding problem gives you practice with regularization for logistic regression on the Breast Cancer dataset, along with end-to-end implementation skills. Read the problem statement, write your solution, and strengthen your understanding of the topic.
- Problem ID: 131
- Problem key: 131-breast-cancer-regularized-logistic-regression
- URL: https://datacrack.app/solve/131-breast-cancer-regularized-logistic-regression
- Difficulty: medium
- Topic: Regularization for Logistic Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Breast Cancer Regularized Logistic Regression
---
### 🎯 Goal
Apply **regularized logistic regression** to a real-world classification task using the Breast Cancer dataset.
You will choose between **L1**, **L2**, and **Elastic Net** using a `penalty_type` parameter.
---
### 💻 Task
You need to build a complete classification pipeline on the **Breast Cancer** dataset.
Steps:
1. Load the Breast Cancer dataset using `sklearn.datasets.load_breast_cancer()`.
2. Separate the dataset into:
   - `X`: feature matrix
   - `y`: binary target labels
3. Standardize the full feature matrix using `StandardScaler`.
4. Choose the model based on `penalty_type`:
   - `"l1"` → use `LogisticRegression(penalty='l1', C=C_param, solver='saga', max_iter=10000, random_state=42)`
   - `"l2"` → use `LogisticRegression(penalty='l2', C=C_param, solver='saga', max_iter=10000, random_state=42)`
   - `"elasticnet"` → use `LogisticRegression(penalty='elasticnet', C=C_param, solver='saga', l1_ratio=0.5, max_iter=10000, random_state=42)`
5. Train the selected model on the full standardized dataset.
6. Standardize the provided `X_test` samples using the same scaler.
7. Predict labels for the provided samples.
8. Return predictions as a list of integers.
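Step 4 can be sketched as a small helper. The name `build_model` is hypothetical and not part of the starter code, and raising `ValueError` for an unrecognized `penalty_type` is an assumption, since the problem does not say how invalid input should be handled.

```python
from sklearn.linear_model import LogisticRegression


def build_model(penalty_type, C_param):
    """Return an unfitted LogisticRegression for the requested penalty (hypothetical helper)."""
    # Settings shared by all three variants, as given in the task.
    shared = dict(C=C_param, solver='saga', max_iter=10000, random_state=42)

    if penalty_type in ('l1', 'l2'):
        return LogisticRegression(penalty=penalty_type, **shared)
    if penalty_type == 'elasticnet':
        # l1_ratio is only meaningful for the Elastic Net penalty.
        return LogisticRegression(penalty='elasticnet', l1_ratio=0.5, **shared)

    # Assumption: reject anything else explicitly rather than guessing.
    raise ValueError(f"Unsupported penalty_type: {penalty_type!r}")
```

The returned model is unfitted; fitting and prediction (steps 5 to 8) still have to be done by the caller.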
---
### 📖 Background
The Breast Cancer dataset is a binary classification dataset.
It contains 569 samples, 30 numeric features, and a binary target (0 = malignant, 1 = benign).
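A quick way to confirm these facts is to load the dataset and inspect it. This is a small sanity check, not part of the required solution.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)                     # (569, 30)
print(data.target_names.tolist())  # ['malignant', 'benign'] -> label 0, label 1
print(np.bincount(y).tolist())     # [212, 357] samples for labels 0 and 1
```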
Regularized logistic regression adds a penalty on the size of the model weights to the loss, which discourages overly large weights:
| `penalty_type` | Model | Penalty | Main Effect |
| :--- | :--- | :--- | :--- |
| `"l2"` | L2 Logistic Regression | L2 | Shrinks weights smoothly |
| `"l1"` | L1 Logistic Regression | L1 | Can push some weights to zero |
| `"elasticnet"` | Elastic Net | L1 + L2 | Combines sparsity and smooth shrinkage |
In scikit-learn, `C` is the inverse regularization strength.
- Smaller `C` means stronger regularization.
- Larger `C` means weaker regularization.
For Elastic Net, use `l1_ratio=0.5`.
Use `solver='saga'` because it is the only scikit-learn solver that supports all three penalties: L1, L2, and Elastic Net.
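The effect of `C` can be seen by counting nonzero coefficients under an L1 penalty. The sketch below is illustrative only: the two `C` values and the name `nonzero_by_C` are arbitrary choices for the demonstration, not values from the problem.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

nonzero_by_C = {}
for C in (0.01, 1.0):  # arbitrary demo values: strong vs. weaker regularization
    model = LogisticRegression(penalty='l1', C=C, solver='saga',
                               max_iter=10000, random_state=42)
    model.fit(X_scaled, y)
    # Count how many of the 30 feature weights were not driven to zero.
    nonzero_by_C[C] = int(np.count_nonzero(model.coef_))

print(nonzero_by_C)
```

With the smaller `C`, the stronger penalty should leave fewer nonzero weights than with the larger `C`.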
---
### 📥 Input / 📤 Output
* **Input:**
  * `X_test`: list of samples (raw, unstandardized features)
  * `penalty_type`: string, one of `"l1"`, `"l2"`, or `"elasticnet"`
  * `C_param`: float, inverse regularization strength
* **Output:**
  * List of predicted integer labels
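`predict` returns a NumPy array, so the result needs converting to a plain list. A minimal sketch, where the hard-coded array `preds` stands in for real model output:

```python
import numpy as np

# Stand-in for the array returned by model.predict(...).
preds = np.array([0, 1, 1])

# tolist() converts NumPy integers into built-in Python ints.
as_list = preds.astype(int).tolist()
print(as_list)  # [0, 1, 1]
```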
---
### ⚠️ Important Notes
* Your function handles the entire pipeline internally.
* The `X_test` input represents new raw samples to predict after the model is trained.
* Fit the scaler on the full Breast Cancer dataset, then use the same scaler to transform `X_test`.
* Do not use train/test split in this problem.
* For Elastic Net, use `l1_ratio=0.5`.
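The scaler note can be illustrated as follows. This is a sketch in which the first three dataset rows stand in for `X_test`, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)

# Fit the scaler once, on the full dataset.
scaler = StandardScaler().fit(X)

# Stand-in for the raw X_test argument (a list of lists).
X_new = X[:3].tolist()

# Reuse the fitted scaler: transform only, never fit again on X_new.
X_new_scaled = scaler.transform(np.asarray(X_new))

# Refitting on the new samples would use their own mean and variance
# and produce different, inconsistent values.
X_new_refit = StandardScaler().fit_transform(np.asarray(X_new))
print(np.allclose(X_new_scaled, X_new_refit))  # False
```

Reusing the fitted scaler keeps the new samples on the same scale the model was trained on.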
---
### 🧩 Starter Code
```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression


def breast_cancer_logistic_regression(X_test, penalty_type, C_param):
    """
    Train a regularized logistic regression model on the Breast Cancer dataset.

    Args:
        X_test: raw test features
        penalty_type: 'l1', 'l2', or 'elasticnet'
        C_param: inverse regularization strength

    Returns:
        list: predicted integer labels
    """
    # TODO: Implement the full pipeline
    pass
```
---
### 💡 Example
```python
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
X_test_samples = data.data[:3].tolist()
predictions = breast_cancer_logistic_regression(X_test_samples, "l2", 1.0)
print(predictions)
```
Expected Output:
```python
[0, 0, 0]
```
---