Softmax Function Practice Problem
This data science coding problem helps you practice logistic regression, the softmax function, and your implementation skills. Read the problem statement and write your solution to strengthen your understanding of logistic regression.
- Problem ID: 16
- Problem key: 16-softmax-function
- URL: https://datacrack.app/solve/16-softmax-function
- Difficulty: easy
- Topic: Logistic Regression
- Module: Introduction to Machine Learning
Problem Statement
# 🧩 Softmax Function
---
### 🎯 Goal
Implement the **Softmax function**, which generalizes the **sigmoid function** to handle **multiclass classification** problems.
This will help you understand how model outputs can be converted into **probabilities that sum to 1**, representing class likelihoods.
---
### 🔍 Explanation of Symbols
| Symbol | Meaning | Shape / Type |
|:------:|:--------|:-------------|
| **$z_i$** | Raw score (logit) for class *i* | scalar |
| **$\sigma(z_i)$** | Softmax probability for class *i* | scalar |
| **$z$** | Vector of logits (input values) | $(K,)$ where $K$ = number of classes |
---
### 🧮 Background / Intuition
In **binary classification**, we used the **sigmoid function** to map real numbers into probabilities between 0 and 1.
However, when we deal with **multiple classes**, we need a function that:
1️⃣ Converts all raw scores (logits) into **positive probabilities**, and
2️⃣ Ensures those probabilities **sum to 1** across all classes.
---
#### 🔹 What are *logits*?
Before applying Softmax, a model (like logistic regression or a neural network) usually outputs **raw scores**, called **logits**.
These logits can be any real numbers — positive or negative — and they represent the model’s *confidence strength* for each class, **not probabilities** yet. Softmax transforms these logits into **probabilities** that can be compared directly.
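As a concrete illustration, here is how a multiclass logistic-regression model might produce logits from a feature vector. The feature values, weights `W`, and bias `b` below are arbitrary numbers made up for this example:

```python
import numpy as np

# Hypothetical model: 4 features, 3 classes (all values made up)
x = np.array([0.5, -1.2, 3.0, 0.7])            # one input example
W = np.array([[ 0.4, -0.1,  0.6,  0.0],        # weight matrix (3 classes x 4 features)
              [-0.2,  0.3, -0.1,  0.5],
              [ 0.1,  0.0,  0.2, -0.3]])
b = np.zeros(3)                                 # bias vector

logits = W @ x + b          # raw scores, one per class
print(logits)               # [ 2.12 -0.41  0.44] -- real numbers, not probabilities
```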
The **Softmax function** is defined as:
$$
\sigma(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}
$$
This means each score is exponentiated and normalized by the sum of all exponentials, producing a valid probability distribution.
It can be viewed as an **extension of the sigmoid function** — while sigmoid handles one output (two classes), softmax handles many.
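To see this connection explicitly, set $K = 2$:

$$
\sigma(z_1) = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}}
$$

which is exactly the sigmoid function applied to the difference of the two logits.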
---
#### 🔹 Example Intuition
If a model outputs logits `[2.0, 1.0, 0.1]`,
after applying Softmax we get probabilities of approximately `[0.659, 0.242, 0.099]`.
This means:
* Class 0 has ≈ 65.9% probability (most likely),
* Class 1 has ≈ 24.2% probability,
* Class 2 has ≈ 9.9% probability.
The class with the **highest Softmax probability** is the model’s **predicted class**.
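You can reproduce these numbers with a few lines of NumPy (a quick sanity check using the naive formula, without any numerical-stability tricks):

```python
import numpy as np

z = np.array([2.0, 1.0, 0.1])   # raw logits from the example above
exp_z = np.exp(z)               # exponentiate each logit
probs = exp_z / exp_z.sum()     # normalize by the sum of all exponentials

print(np.round(probs, 3))       # [0.659 0.242 0.099]
print(probs.sum())              # ~1.0 -- a valid probability distribution
```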
---
### 📥 Input / 📤 Output
**Input:**
- `z` (`list[float]`): A list of real-valued numbers representing class logits.
**Output:**
- `list[float]`: The corresponding softmax probabilities that sum to 1.
---
### 💻 Task Description
Implement the **Softmax function** for a given list of logits.
Your implementation should:
- Compute exponentials of all input elements.
- Normalize them by the sum of all exponentials.
- Return the resulting probabilities as a list of floats.
---
### 🧩 Starter Code
```python
import numpy as np


def softmax(z):
    """
    Compute the Softmax probabilities for a list of logits.

    Args:
        z (list[float]): List of real-valued numbers (logits).

    Returns:
        list[float]: Softmax probabilities that sum to 1.
    """
    # TODO: Implement softmax function
    pass
```
---
### 💡 Example
```python
# Example
z = [2.0, 1.0, 0.1]
print(softmax(z))
```
Expected Output (rounded to 3 decimal places):
```
[0.659, 0.242, 0.099]
```
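For reference, one possible implementation is sketched below (a minimal sketch, not the official solution). It subtracts the maximum logit before exponentiating, a standard trick that prevents overflow for large logits and leaves the result unchanged, since adding a constant to every logit cancels out in the ratio:

```python
import numpy as np


def softmax(z):
    """
    Compute the Softmax probabilities for a list of logits.

    Args:
        z (list[float]): List of real-valued numbers (logits).

    Returns:
        list[float]: Softmax probabilities that sum to 1.
    """
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()        # softmax(z) == softmax(z - c) for any constant c
    exp_z = np.exp(shifted)      # every entry is now in (0, 1], so no overflow
    return (exp_z / exp_z.sum()).tolist()
```

Running the example above through this sketch yields values that round to `[0.659, 0.242, 0.099]`.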