
In semantic segmentation and similar applications, sparse categorical cross entropy is often used as a loss function. It commonly happens that the classes are imbalanced. In my case, one class accounts for about 60% of the total image area and the remaining 30 classes share the rest.

As shown in this tutorial, tf.keras.losses.SparseCategoricalCrossentropy accepts not only the labels and predictions (masks and model output) but also per-sample weights via its sample_weight argument. Now I wonder how to find the most suitable weights.
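For context, this is roughly how I pass per-pixel weights into the loss (a minimal sketch, assuming integer masks of shape (batch, H, W) and a hypothetical class_weights vector that is indexed by the mask values):

```python
import tensorflow as tf

# Hypothetical weight vector with one entry per class (here: 31 classes).
class_weights = tf.constant([0.5] + [1.0] * 30, dtype=tf.float32)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

def weighted_loss(y_true, y_pred):
    # y_true: integer mask, shape (batch, H, W)
    # y_pred: softmax output, shape (batch, H, W, n_classes)
    # Look up the weight of each pixel's true class.
    sample_weight = tf.gather(class_weights, tf.cast(y_true, tf.int32))
    return loss_fn(y_true, y_pred, sample_weight=sample_weight)
```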

So far, I have simply searched through all the masks in the training set and created an array with the total pixel count for each class. Then I calculated the weights $w_i$ from the inverse pixel counts as $w_i = \frac{\max(\text{count})}{\text{count}_i}$ for each class.
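Concretely, I do something along these lines (a sketch; train_masks stands for my list of integer label masks):

```python
import numpy as np

def inverse_frequency_weights(masks, n_classes):
    """w_i = max(count) / count_i, computed from integer label masks."""
    counts = np.bincount(
        np.concatenate([m.ravel() for m in masks]), minlength=n_classes
    ).astype(np.float64)
    counts = np.maximum(counts, 1)  # avoid division by zero for absent classes
    return counts.max() / counts

# e.g. weights = inverse_frequency_weights(train_masks, n_classes=31)
```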

Let's say we have images with $n$ unequally frequent classes $c_1, \dots, c_n$ and a model that simply classifies every encountered pixel as class $c_i$. Then I would like to have weights that make this behaviour equally costly no matter which class $c_i$ is chosen; i.e., $\text{loss}_i = x \;\forall\; i \in \{1, \dots, n\}$.
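A rough back-of-the-envelope check (assuming the constant model assigns probability $1-\varepsilon$ to class $c_i$, so that pixels of class $c_i$ contribute $-w_i \log(1-\varepsilon) \approx 0$ and every misclassified pixel contributes about $-\log\varepsilon$ scaled by its weight):

$\text{loss}_i \approx -\log\varepsilon \sum\limits_{j \ne i} w_j\,\text{count}_j$

With $w_j = \frac{\max(\text{count})}{\text{count}_j}$ we get $w_j\,\text{count}_j = \max(\text{count})$ for every $j$, so $\text{loss}_i \approx -(n-1)\,\max(\text{count})\log\varepsilon$, which is independent of $i$. If my reasoning is sound, the inverse-count weights already have the property I want, but I am not sure.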

Does this make sense? And is it possible to implement?

On the internet, I found a formula for categorical cross entropy (I am not sure whether it is the same for sparse categorical cross entropy, though):

$\text{cce} = - \frac{1}{N} \sum\limits_{i=1}^{N} \sum\limits_{c=1}^{C} \mathbb{1}_{y_i \in C_c} \log p_\text{model}\left[y_i \in C_c\right]$
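As far as I understand, the sparse variant computes the same quantity and only differs in taking the labels as integer indices instead of one-hot vectors. A quick check in TensorFlow (with made-up values):

```python
import numpy as np
import tensorflow as tf

y_true = np.array([0, 2, 1])                 # integer labels
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.1, 0.7, 0.2]])         # softmax probabilities

sparse = tf.keras.losses.SparseCategoricalCrossentropy()(y_true, y_pred)
dense = tf.keras.losses.CategoricalCrossentropy()(tf.one_hot(y_true, 3), y_pred)
# Both evaluate the formula above; the values agree.
print(float(sparse), float(dense))
```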

Maybe someone has the mathematical knowledge to find a solution to this problem (or can tell me where I am wrong).

Manuel Popp