Softmax
Advantage: Maps the raw output vector to values between 0 and 1 that sum to 1.
Key word: Activation Function
Usage: Normalizes the output of a network into a probability distribution over the predicted output classes. Used in multiclass classification problems.
- Form
Softmax is typically applied to the output of the final fully-connected layer. Each output value is the probability of one category.
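As a rough illustration of the form, here is a minimal sketch in plain Python. The function name `softmax` and the three example scores are assumptions made for illustration; they stand in for whatever the last layer of a real network produces.

```python
import math

def softmax(logits):
    """Map a list of raw scores to probabilities that sum to 1."""
    # Subtracting the largest score does not change the result,
    # but it keeps exp() from overflowing on large inputs.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes (not from a real model).
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
```

Each entry of `probs` lies between 0 and 1, the entries sum to 1, and the class with the largest score gets the largest probability.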
The reason to use it instead of sigmoid is that sigmoid scores each class independently, so its outputs do not sum to 1 and tend to be pushed toward either end (0 or 1).
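The difference can be seen on a small example. The sketch below is only illustrative: the helper names and the scores are assumptions, chosen so that two classes both have large positive scores.

```python
import math

def sigmoid(z):
    """Squash one score into (0, 1), independently of the other scores."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Squash all scores jointly so the results sum to 1."""
    largest = max(logits)  # shift for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes (not from a real model).
scores = [4.0, 3.0, -2.0]

per_class_sigmoid = [sigmoid(z) for z in scores]
joint_softmax = softmax(scores)
```

With these scores, sigmoid gives the first two classes values above 0.9 each, so the outputs add up to more than 1 and cannot be read as one distribution over the classes. Softmax makes the classes compete, so its outputs sum to 1 and the first class clearly wins.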