I received an answer to a question about learning, and I noticed the writer gave loss functions as $l(p,y)$, where $p$ is a predicted distribution, and $y$ is a class index. https://stats.stackexchange.com/a/589070/96019
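To make the notation concrete, here is a minimal sketch of what I understand by a loss of the form $l(p,y)$. The two losses shown, log loss and the Brier score, are my own assumption about what fits this signature and are not necessarily the ones used in the linked answer. The values of `p` and `y` are hypothetical.

```python
import math

def log_loss(p, y):
    """Log loss l(p, y) = -log p[y]; p is a predicted distribution, y a class index."""
    return -math.log(p[y])

def brier_loss(p, y):
    """Brier (quadratic) loss: squared distance between p and the one-hot vector for y."""
    return sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))

p = [0.7, 0.2, 0.1]  # hypothetical predicted distribution over three classes
y = 0                # hypothetical true class index

print(log_loss(p, y))
print(brier_loss(p, y))
```

In both cases the loss takes the whole predicted distribution and a single class index, rather than a predicted label and a true label.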
Does anyone have a reference where loss functions for classification or any other problem are given this way? The writer convinced me that the formulation is useful, so I would like some reference material to study further.