
I'm reading Yehuda Koren's paper: "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model" SIGKDD 2008.

I notice that the traditional neighborhood method uses the following baseline estimate:

$\hat{r}_{ui} = \mu + b_u + b_i$

Here the user and item biases are denoted by two parameters, $b_u$ and $b_i$, which could be calculated by averaging the deviations from $\mu$ of all ratings of user $u$ and item $i$, respectively.

However, when it comes to the SVD++ model, the author uses $b_{ui}$ instead of $b_u + b_i$, i.e.,

$\hat{r}_{ui} = b_{ui} + q_i^T\left(p_u + |N(u)|^{-1/2} \sum_{j\in N(u)} y_j \right)$

I was wondering why the author uses $b_{ui}$ instead of $b_u + b_i$. Is that related to the parameter estimation used in matrix factorization-based methods?

ice_lin

1 Answer


Well, it is a sad story that no one answered this question. However, I eventually figured out the answer.

The $b_{ui}$ in SVD++ is just the baseline estimate produced by the baseline model defined above: $b_{ui} := \mu + b_u + b_i$. So the SVD++ prediction still contains $\mu + b_u + b_i$; it is only written more compactly.

It is very easy, at least for me, to mistake $b_{ui}$ for a separate bias of user $u$ toward item $i$, and that can lead to a wrong interpretation of the SVD++ model.
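To make the substitution concrete, here is a minimal sketch in plain Python of how the prediction could be computed once $b_{ui}$ is expanded to $\mu + b_u + b_i$. The function names (`baseline`, `svdpp_predict`) and all the numbers are made up for illustration and are not taken from the paper; parameter fitting is not shown.

```python
import math


def baseline(mu, b_u, b_i):
    """Baseline estimate b_ui = mu + b_u + b_i."""
    return mu + b_u + b_i


def svdpp_predict(mu, b_u, b_i, q_i, p_u, y_neighbors):
    """SVD++ prediction: b_ui + q_i^T (p_u + |N(u)|^(-1/2) * sum_j y_j).

    y_neighbors holds one implicit-feedback factor vector y_j for each
    item j in N(u).
    """
    # Start from the explicit user factors p_u.
    user_vec = list(p_u)
    n = len(y_neighbors)
    if n > 0:
        scale = 1.0 / math.sqrt(n)  # |N(u)|^(-1/2)
        for y_j in y_neighbors:
            for k, value in enumerate(y_j):
                user_vec[k] += scale * value
    # Inner product between the item factors and the adjusted user factors.
    interaction = sum(q * u for q, u in zip(q_i, user_vec))
    return baseline(mu, b_u, b_i) + interaction


# Hypothetical toy values, chosen only so the arithmetic is easy to follow.
mu, b_u, b_i = 3.5, 0.25, -0.5
q_i = [1.0, 0.0]
p_u = [0.5, 2.0]
y_neighbors = [[0.5, 0.0], [0.5, 0.0], [1.0, 0.0], [0.0, 0.0]]  # |N(u)| = 4

pred = svdpp_predict(mu, b_u, b_i, q_i, p_u, y_neighbors)
print(baseline(mu, b_u, b_i), pred)
```

In this sketch $b_{ui}$ is never stored as a separate parameter per (user, item) pair; it is always recomputed from $\mu$, $b_u$ and $b_i$, which is exactly the point of the answer.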

ice_lin