
When we use deep neural networks (DNNs) to solve a one-dimensional regression problem, we can approximate the data distribution with the output of a DNN, as in the picture below.
My question is that a DNN does not itself assume a Gaussian (or any other) distribution. It just knows what value to output when it sees an input. So how do you obtain a probability distribution from a DNN? For example, if someone asks what the probability is of a point appearing at $(5, 0)$, can a DNN answer this kind of question?

(image: DNN regression fit to scattered data, from https://medium.com/@sunnerli/dnn-regression-in-tensorflow-16cc22cdd577)

Lion Lai

1 Answer


For many regression algorithms, not only neural networks, the model assumes the data are distributed as $y \sim \mathcal{N}(f(x;\theta), \sigma^2)$, where $\theta$ are the model parameters and $\sigma^2$ is the variance of the distribution (often a hyperparameter).

Maximizing the log-likelihood of the data with respect to $\theta$ is equivalent to minimizing the mean squared error loss between the $y_i$ and $f(x_i;\theta)$.
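
To spell out the equivalence (a standard derivation, assuming $n$ i.i.d. data points $(x_i, y_i)$): the negative log-likelihood under this model is

$$-\log p(\mathbf{y} \mid \mathbf{x}; \theta) = \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\theta)\bigr)^2 + \frac{n}{2}\log(2\pi\sigma^2),$$

and the second term does not depend on $\theta$, so for fixed $\sigma^2$ minimizing the negative log-likelihood over $\theta$ is exactly minimizing the (scaled) sum of squared errors.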

Therefore, to compute the probability density at $(5,0)$, you would evaluate a Gaussian density with mean $f(5; \theta)$ and variance $\sigma^2$ at $y = 0$, where $f$ is your neural network.
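
As a minimal sketch of that computation (the fitted function `f` and the noise variance `sigma2` below are stand-ins; in practice `f` would be your trained network's forward pass and `sigma2` your chosen or estimated variance):

```python
import numpy as np

def gaussian_density(y, mean, sigma2):
    """Density of N(mean, sigma2) evaluated at y."""
    return np.exp(-(y - mean) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def f(x):
    # Hypothetical trained regression model; replace with your network's output.
    return np.sin(x)

sigma2 = 0.25  # assumed noise variance (a hyperparameter)

# Density of observing y = 0 at input x = 5, i.e. the point (5, 0):
p = gaussian_density(0.0, f(5.0), sigma2)
```

Note this is a density, not a probability: the probability of hitting any exact point is zero, so to get a probability you would integrate the density over an interval around $y = 0$.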

shimao
  • Thanks for your answer. But I still have two questions. 1: Does applying a DNN to a regression problem also require the assumption that the data are Gaussian distributed? As far as I know, we only care about the mean squared error (MSE) between the output value and the ground-truth value; no Gaussian distribution is involved. 2. How do I find out a DNN's multiple means and variances from its weights and biases only? Is this even possible? – Lion Lai Jan 30 '18 at 04:09
  • Using MSE (L2) loss corresponds to the data being distributed normally. Using L1 loss corresponds to the data being distributed according to the Laplace distribution. In general, there is a mapping between loss functions and probability distributions. 2. Not sure what you mean by a DNN's multiple means and variances. – shimao Jan 30 '18 at 04:11
  • Are you referring to regularization? Can you add references for them? 2. After training a DNN model, all we have are the network's weights and biases. How can I calculate the density function from these numbers? Thank you. – Lion Lai Jan 30 '18 at 04:18
  • No, I am referring to the prediction loss. L1 and L2 are just mathematical functions which can be applied either to the difference between prediction and ground truth, $y-f(x)$, or to the model parameters $\theta$. In this question only the former is relevant. 2. Given input $x$, you feed it through the network to produce $f(x)$. The density function is a Gaussian distribution centered on $f(x)$ with variance $\sigma^2$. – shimao Jan 30 '18 at 04:24