Diabetes Regularized Linear Regression Practice Problem
This data science coding problem helps you practice regularization for linear regression on the Diabetes dataset. Read the problem statement, write your solution, and strengthen your understanding of Ridge, Lasso, and Elastic Net.
- Problem ID: 126
- Problem key: 126-diabetes-regularized-linear-regression
- URL: https://datacrack.app/solve/126-diabetes-regularized-linear-regression
- Difficulty: medium
- Topic: Regularization for Linear Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Diabetes Regularized Linear Regression
---
### 🎯 Goal
Apply **regularized linear regression** to a real-world regression task using the Diabetes dataset.
You will choose between **Ridge**, **Lasso**, and **Elastic Net** using a `reg_type` parameter.
---
### 💻 Task
You need to build a complete regression pipeline on the **Diabetes** dataset.
Steps:
1. Load the Diabetes dataset using `sklearn.datasets.load_diabetes()`.
2. Standardize the full feature matrix using `StandardScaler`.
3. Choose the model based on `reg_type`:
   - `"ridge"` → use `Ridge(alpha=alpha_param)`
   - `"lasso"` → use `Lasso(alpha=alpha_param, max_iter=5000)`
   - `"elasticnet"` → use `ElasticNet(alpha=alpha_param, l1_ratio=0.5, max_iter=5000)`
4. Train the selected model on the full standardized dataset.
5. Standardize the provided `X_test` samples using the same scaler.
6. Predict target values for the provided samples.
7. Return predictions as a list, each rounded to 6 decimal places.
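
The seven steps above can be sketched as follows. This is one possible reading of the task, not an official reference solution. The function name `pipeline_sketch` is hypothetical, and raising `ValueError` for an unrecognized `reg_type` is an assumption, since the problem does not say how to handle that case.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler


def pipeline_sketch(X_test, reg_type, alpha_param):
    # Steps 1-2: load the data and standardize the full feature matrix.
    X, y = load_diabetes(return_X_y=True)
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # Step 3: pick the model from reg_type.
    if reg_type == "ridge":
        model = Ridge(alpha=alpha_param)
    elif reg_type == "lasso":
        model = Lasso(alpha=alpha_param, max_iter=5000)
    elif reg_type == "elasticnet":
        model = ElasticNet(alpha=alpha_param, l1_ratio=0.5, max_iter=5000)
    else:
        # Assumption: the problem does not specify behavior for other values.
        raise ValueError(f"unknown reg_type: {reg_type!r}")

    # Step 4: train on the full standardized dataset.
    model.fit(X_scaled, y)

    # Steps 5-6: reuse the fitted scaler on the new samples, then predict.
    X_new = scaler.transform(np.asarray(X_test, dtype=float))
    preds = model.predict(X_new)

    # Step 7: plain Python floats, each rounded to 6 decimal places.
    return [round(float(p), 6) for p in preds]
```

Converting each prediction with `float(...)` before rounding keeps NumPy scalar types out of the returned list, which can matter if a grader compares against plain Python floats.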
---
### 📚 Background
The Diabetes dataset is a regression dataset.
It contains 442 samples, 10 numeric features, and a continuous target value representing disease progression.
Regularized linear regression adds a penalty to control model weights:
| `reg_type` | Model | Penalty | Main Effect |
| :--- | :--- | :--- | :--- |
| `"ridge"` | Ridge Regression | L2 | Shrinks weights smoothly |
| `"lasso"` | Lasso Regression | L1 | Can push some weights to zero |
| `"elasticnet"` | Elastic Net | L1 + L2 | Combines sparsity and smooth shrinkage |
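
To see the "Main Effect" column in practice, the sketch below fits all three models on the standardized Diabetes features and counts how many weights remain nonzero. The value `alpha=10.0` is an arbitrary choice for illustration and is not part of the problem. On this dataset Ridge should keep all ten weights while Lasso drops at least some of them.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# alpha=10.0 is an illustrative value only, not one given in the problem.
models = {
    "ridge": Ridge(alpha=10.0),
    "lasso": Lasso(alpha=10.0, max_iter=5000),
    "elasticnet": ElasticNet(alpha=10.0, l1_ratio=0.5, max_iter=5000),
}

nonzero_counts = {}
for name, model in models.items():
    model.fit(X_scaled, y)
    # Count the weights that were not driven exactly to zero.
    nonzero_counts[name] = int(np.count_nonzero(model.coef_))
    print(f"{name:>10}: {nonzero_counts[name]} of {X.shape[1]} weights are nonzero")
```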
In scikit-learn, `alpha` controls regularization strength.
- Larger `alpha` means stronger regularization.
- Smaller `alpha` means weaker regularization.
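
A quick way to check this is to fit Ridge at several strengths and watch the size of the weight vector shrink. The alpha grid below is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Illustrative grid of regularization strengths, weakest to strongest.
alphas = [0.01, 1.0, 100.0, 10000.0]
weight_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X_scaled, y)
    # L2 norm of the learned weights; it should fall as alpha grows.
    weight_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha:>8}: ||w||_2 = {weight_norms[-1]:.3f}")
```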
For Elastic Net, use `l1_ratio=0.5`.
This gives a balanced mix between L1 and L2 regularization.
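
For reference, the objectives that scikit-learn documents for these three estimators can be written as below, where $n$ is the number of samples, $w$ is the weight vector, $\rho$ stands for `l1_ratio`, and the intercept is omitted for brevity:

$$
\begin{aligned}
\text{Ridge:} \quad & \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{Lasso:} \quad & \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1 \\
\text{Elastic Net:} \quad & \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
\end{aligned}
$$

With $\rho = 0.5$, the L1 term is weighted by $0.5\,\alpha$ and the squared L2 term by $0.25\,\alpha$. Note that the Ridge loss is not divided by $2n$, so the same numeric `alpha` does not represent the same effective strength across the three models.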
---
### 📥 Input / 📤 Output
* **Input:**
  * `X_test`: list of samples (raw, unstandardized features)
  * `reg_type`: string, one of `"ridge"`, `"lasso"`, or `"elasticnet"`
  * `alpha_param`: float, regularization strength
* **Output:**
  * List of predicted continuous target values, rounded to 6 decimals
---
### ⚠️ Important Notes
* Your function handles the entire pipeline internally.
* The `X_test` input represents new raw samples to predict after the model is trained.
* Fit the scaler on the full Diabetes dataset, then use the same scaler to transform `X_test`.
* Do not use train/test split in this problem.
* For Elastic Net, use `l1_ratio=0.5`.
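
The scaler note is the easiest one to get wrong, so here is a small sketch of the difference between reusing the fitted scaler and refitting it on the new samples. The first three dataset rows stand in for `X_test`; the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler

X, _ = load_diabetes(return_X_y=True)
X_new = X[:3]  # stand-in for the raw X_test samples

# Intended: fit once on the full dataset, then only transform the new samples.
scaler = StandardScaler().fit(X)
X_new_ok = scaler.transform(X_new)

# Mistake: refitting on X_new computes the mean and std from just these 3 rows.
X_new_refit = StandardScaler().fit_transform(X_new)

# The reused scaler reproduces exactly what the model saw during training.
print(np.allclose(X_new_ok, scaler.transform(X)[:3]))
# The refit version places the same samples at different coordinates.
print(np.allclose(X_new_ok, X_new_refit))
```

Because the model's weights were learned in the coordinate system of the full-dataset scaler, predictions made on refit features would be computed from inputs the model never saw in that form.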
---
### 🧩 Starter Code
```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge, Lasso, ElasticNet


def diabetes_regularized_regression(X_test, reg_type, alpha_param):
    """
    Train a regularized linear regression model on the Diabetes dataset.

    Args:
        X_test: raw test features
        reg_type: 'ridge', 'lasso', or 'elasticnet'
        alpha_param: regularization strength

    Returns:
        list: predicted target values
    """
    # TODO: Implement the full pipeline
    pass
```
---
### 💡 Example
```python
from sklearn.datasets import load_diabetes
data = load_diabetes()
X_test_samples = data.data[:3].tolist()
predictions = diabetes_regularized_regression(X_test_samples, "ridge", 1.0)
print(predictions)
```
Expected Output:
```python
[204.423293, 68.34348, 176.202462]
```
---