Continuous logit framework

Question

I am reading Dupuy & Galichon (2014), which extend the estimation of matching model in Choo & Siow (2006) to continuous types. The way they build the continuous logit model is based on the insights of Cosslett (1988) and Dagsvik (1994) which "have independently suggested using max-stable processes to model continuous choice."

The detail of the continuous logit model model is described in the Appendix A:

"In this paragraph, we expound the main ideas of Cosslett (1988) and Dagsvik (1994), who show how to obtain a continuous version of the multinomial logit model. Assume that $\lbrace\left(y_k^m, \varepsilon_k^m\right), k \in \mathbb{N} \rbrace$ are the points of a Poisson point process on $\mathcal{Y} \times \mathbb{R}$ of intensity $d y \times e^{-\varepsilon} d \varepsilon$. We recall that this implies that for $S$ a subset of $\mathcal{Y} \times \mathbb{R}$, the probability that man $m$ has no acquaintance in set $S$ is $\exp \left(-\int_S e^{-\varepsilon} d y d \varepsilon\right)$. From (2), man $m$ chooses woman $k$ among his acquaintances such that his utility is maximized; that is, man $m$ solves $$\max_k \lbrace U\left(x,y_k^m\right)+\varepsilon_k^m \rbrace$$.

Letting $Z$ be the value of this maximum, one has for any $c \in \mathbb{R}$ $$ \operatorname{Pr}(Z \leq c)=\operatorname{Pr}\left(U\left(x, y_k^m\right)+\varepsilon_k^m \leq c \forall k\right), $$ which is exactly the probability that the Poisson point process $\left(y_k, \varepsilon_k^m\right)$ has no point in $\{(y, \varepsilon): U(x, y)+\varepsilon>c\}$; thus $$ \begin{aligned} \log \operatorname{Pr}(Z \leq c) & =-\iint_{\mathcal{Y} \times \mathbb{R}} 1(U(x, y)+\varepsilon>c) d y e^{-\varepsilon} d \varepsilon \\ & =-\int_{\mathcal{Y}} \int_{c-U(x, y)} e^{-\varepsilon} d \varepsilon d y \\ & =-\int_{\mathcal{Y}} e^{-c+U(x, y)} d y \\ & =-\exp \left[-c+\log \int_{\mathcal{Y}} \exp U(x, y) d y\right] \end{aligned} $$

Hence $Z$ is a $\left(\log \int_{\mathcal{Y}} \exp U(x, y) d y, 1\right)$ Gumbel. In particular, $\mathbb{E}\left[\max_k \lbrace U\left(x, y_k^m\right)+\varepsilon_k^m \rbrace \right]=\log \int_{\mathcal{y}} \exp U(x, y) d y,$

and the choice probabilities are given by their density with respect to the Lebesgue measure

$$\pi(y \mid x)=\exp [U(x, y)] /\left[\int_{\mathcal{Y}} \exp U\left(x, y^{\prime}\right) d y^{\prime}\right] .$$

The same logic also implies that $\lbrace \varepsilon_k: k \in \mathbb{N} \rbrace$ has a Gumbel distribution. Indeed, the probability that this Poisson point process has no element in the set $\{\varepsilon: \varepsilon>c\}$ is equal to $$ \exp \left(-\int_c^{+\infty} e^{-\varepsilon} d \varepsilon\right)=\exp [-\exp (-c)] $$ which is equivalent to saying that $\operatorname{Pr}\left(\max _{k \in \mathbb{N}} \varepsilon_k \leq c\right)=\exp [-\exp (-c)]$. Finally, note that a similar argument would show that $m$ has almost surely an infinite, though countable, number of acquaintances, as announced. "

///

I think I fully understand the derivation towards "$Z$ is a Gumbel". But then I stuck on deriving the perhaps most important equation of logit model: $$ \pi(y \mid x)=\exp [U(x, y)] /\left[\int_{\mathcal{Y}} \exp U\left(x, y^{\prime}\right) d y^{\prime}\right]$$ . I don't see how it comes from the previous derivation.

///

I even checked one of the paper cited, Dagsvik 1994, and found in its appendix (PROOF OF THEOREM4) there is a similar derivation (A.8 to A.9) but again without any further explanation. In case anyone is interested, the equations there are "(A.8) $$ \begin{gathered} P\left(\sup _{T(z) \in A,(T(z), E(z)) \in H, z \in Z}(\hat{v}(\hat{p}(T(z)), T(z), K)+E(z)) \leqslant y\right) \\ = \begin{cases}\exp \left\{-e^{-y} \mu \int_A \exp (\hat{v}(\hat{p}(t), t, K)) G(d t)\right\} & \text { for } y \geqslant c, \\ 0 & \text { for } y<c .\end{cases} \end{gathered} $$

From (A.8) we get (A.9) $$ \begin{gathered} P\left(\sup _{T(z) \in A,(T(z), E(z)) \in H, z \in Z}(\hat{v}(\hat{p}(T(z)), T(z), K)+E(z))\right. \\ \left.>\sup _{T(z) \in D-A,(T(z), E(z)) \in H, z \in Z}(\hat{v}(\hat{p}(T(z)), T(z), K)+E(z))\right) \\ \quad=\frac{\int_{u \leqslant t, u \in D} \exp (\hat{v}(\hat{p}(u), u, K)) G(d u)}{\int_D \exp (\hat{v}(\hat{p}(u), u, K)) G(d u)} \cdot\left(1-\exp \left(-\tilde{\Lambda}_c\right)\right), \end{gathered} $$ where $$ \tilde{\Lambda}_c \equiv \mu e^{-c} \int_D \exp (\hat{v}(\hat{p}(t), t, K)) G(d t) . $$

Since $\Lambda_c$ is the expected number of Poisson points in $H \cap(D \times R)$ the probability that $H \cap(D \times R)$ is nonempty equals $$ 1-\exp \left(-\tilde{\Lambda}_c\right) \text {. } $$"

Alalalalaki · Answer 1 · 2024-02-14T17:16:23.233

This derivation follows the same procedure as the one used in discrete logit model.

Let's denote the index of the maximum given $y$ is selected is $(y_i^m,\varepsilon_i^m)$, where $y_i^m = y$. We have $$\operatorname{Pr}_i = \operatorname{Pr}\left(U\left(x, y_k^m\right)+\varepsilon_k^m \leq U\left(x, y_i^m\right)+\varepsilon_i^m \ \forall k \neq i \right) \\ = \exp \left( -\exp \left[-U\left(x, y_i^m\right)-\varepsilon_i^m+\log \int_{\mathcal{Y}} \exp U(x, y) d y\right] \right)$$.

Then we can write the choice probabilities as $$\pi(y \mid x) = \int \operatorname{Pr}_i e^{-\varepsilon_i} d \varepsilon_i =\int \exp \left( -\exp \left[-U\left(x, y \right)-\varepsilon_i +\log \int_{\mathcal{Y}} \exp U(x, y') d y'\right] \right) e^{-\varepsilon_i} d \varepsilon_i \\ = \int \exp \left( -\exp \left[-\varepsilon_i\right] \exp \left[-U\left(x, y \right)\right] \int_{\mathcal{Y}} \exp U(x, y') d y' \right) e^{-\varepsilon_i} d \varepsilon_i \\ = \int \exp \left( -\exp \left[-\varepsilon_i\right] \int_{\mathcal{Y}} \exp \left[ U(x, y')-U\left(x, y \right) \right] d y \right) e^{-\varepsilon_i} d \varepsilon_i$$

Then define $t=\exp(-\varepsilon_i)$ such that $-\exp (-\varepsilon_i) d \varepsilon_i=d t$ , we have $$\pi(y \mid x) = \int_{\infty}^0 \exp \left( - t \int_{\mathcal{Y}} \exp \left[ U(x, y')-U\left(x, y \right) \right] d y' \right) (-dt) \\ = \int_0^{\infty} \exp \left( - t \int_{\mathcal{Y}} \exp \left[U(x, y')-U\left(x, y \right) \right]d y' \right) dt \\ = \frac{\exp \left( - t \int_{\mathcal{Y}} \exp \left[U(x, y')-U\left(x, y \right) \right] d y' \right)}{ - \int_{\mathcal{Y}} \exp \left[U(x, y')-U\left(x, y \right) \right] d y' } \left.\right|_0 ^{\infty} \\ = \frac{1}{ \int_{\mathcal{Y}} \exp \left[U(x, y')-U\left(x, y \right) \right] d y' } =\frac{\exp U\left(x, y \right)}{ \int_{\mathcal{Y}} \exp U(x, y') d y' }$$

Any comments are welcome.

score 2 · Answer 2 · answered Feb 14 '24 at 07:06

2

As they state, $\pi(y|x)$ is the density of $\ln\left(\int_Y \exp(U(x,y) dy\right)$ (the density being the derivative of the cdf).

If we take the derivative of $\ln\left(\int_Y \exp(U(x,y')) dy' \right)$ with respect to $y$, we obtain:

$$ \pi(y|x) = \frac{\exp(U(x,y))}{\int_Y \exp(U(x,y') dy'} $$ The denominator follows from the $\ln$ operator. The numerator follows from the chain rule given that $\dfrac{d \left(\int_Y exp(U(x,y')) dy'\right)}{dy} = \exp(U(x,y))$.

answered Feb 14 '24 at 07:06

tdm

11,747
9
36

1

Thanks very much for your answer as always. The derivation is clear but can you explain why $\log \int_Y \exp U(x, y) d y$ is the cdf? And their statement is "choice probabilities are given by their density with respect to the Lebesgue measure" but not the choice probabilities are the density function itself? – Alalalalaki Feb 14 '24 at 12:50

Continuous logit framework

2 Answers2