I know linear regression is used for a continuous response variable and logistic regression for a binary response variable.

But is there a name for regression on a response variable that lies between 0 and 1? Is it valid to set the objective function to $\text{minimize}~ \left\| \frac{1}{1+e^{-X\beta}} - y \right\|_2^2$, and does this approach have a name?
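
For concreteness, the objective above is a nonlinear least-squares problem with a logistic (sigmoid) mean function. The following is only a minimal sketch in R, assuming simulated data; the names `x`, `y`, `b0`, `b1`, and `b_hat` are illustrative and not part of the question.

```r
# Simulated data (illustrative only): a response in (0, 1) with a sigmoid mean.
set.seed(1)
n <- 200
x <- rnorm(n)
mu <- plogis(-0.5 + 1.2 * x)             # plogis(z) = 1 / (1 + exp(-z))
y <- rbeta(n, mu * 50, (1 - mu) * 50)    # beta noise keeps y inside (0, 1)

# Nonlinear least squares: choose b0, b1 to minimize
# sum((plogis(b0 + b1 * x) - y)^2), i.e. the squared-error objective above.
fit <- nls(y ~ plogis(b0 + b1 * x), start = list(b0 = 0, b1 = 0))
b_hat <- coef(fit)
print(b_hat)

# Fitted means stay inside (0, 1) because of the sigmoid.
print(range(fitted(fit)))
```

Least squares ignores the fact that the variance of a bounded response typically shrinks near 0 and 1, which is one reason likelihood-based or quasi-likelihood alternatives are often preferred.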

Haitao Du
  • It seems like this would just be regular linear regression, right? If it is allowed to be continuous, but just bounded between 0 and 1, it's just regression. You could normalize any continuous DV to be bounded at 0 and 1. The scale affects the size of the unstandardized regression coefficients, but not the actual process of doing the regression. – Mark White Jul 28 '17 at 15:14
  • @MarkWhite how about this? – Haitao Du Jul 28 '17 at 15:15
  • @MarkWhite Standardizing the response variable (not predictors) is still a regression problem, but the model would not be linear. – Kevin Jul 28 '17 at 15:31
  • Beta regression can be used for a response variable bound by 0 and 1. – Sal Mangiafico Jul 28 '17 at 15:32
  • $y$ isn't a linear function of $x$; the function looks like a sheet bent into a sigmoid shape along one direction in input space (with the direction and steepness defined by the weights) – user20160 Jul 28 '17 at 15:45
  • Your question relies on mistaken premises: "regression on a continuous variable" is not limited to linear regression, and similarly "regression on a 0/1 variable" is not limited to logistic regression (they're examples of, not names for, those things); similarly there's more than one regression model for continuous data on (0,1)... – Glen_b Jul 29 '17 at 08:48
  • I made another comment about this in Marcio's answer, but it's important to note whether $y \in (0,1)$ or $y \in [0,1]$. Your question suggests you are looking at $y \in (0,1)$, but in practice, I think $y \in [0,1]$ is much more common. – Cliff AB Jan 08 '18 at 18:20
  • See https://stats.stackexchange.com/questions/216122/what-is-the-difference-between-logistic-regression-and-fractional-response-regre – kjetil b halvorsen Aug 26 '23 at 13:51

3 Answers

If the response variable lies between 0 and 1, you could model it with beta regression. The seminal paper is

Ferrari, S.L.P., and Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815.

There is also an R package called 'betareg'. An example from the documentation:

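library(betareg)
# GasolineYield ships with betareg; yield is a proportion in (0, 1)
data("GasolineYield", package = "betareg")
# Beta regression of yield on batch and temperature (logit link by default)
gy <- betareg(yield ~ batch + temp, data = GasolineYield)
summary(gy)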
  • Important note: if I recall correctly, the methods in this paper require that $y \in (0,1)$, not $y \in [0,1]$. This is of high consequence if your $y$ is something like the estimated probability from a binomial distribution, which has positive probability of being 0 or 1. – Cliff AB Jan 08 '18 at 18:19
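
If the data do contain exact 0s or 1s, one workaround described in the betareg package vignette (following Smithson and Verkuilen, 2006) is to compress the response slightly into the open interval via $(y \cdot (n - 1) + 0.5)/n$, where $n$ is the sample size. A minimal sketch, assuming a hypothetical helper named `squeeze01` and made-up values:

```r
# Hypothetical helper: map y in [0, 1] into the open interval (0, 1)
# using (y * (n - 1) + 0.5) / n, with n the number of observations.
squeeze01 <- function(y) {
  n <- length(y)
  (y * (n - 1) + 0.5) / n
}

y_closed <- c(0, 0.25, 0.5, 1)   # made-up data with boundary values
y_open <- squeeze01(y_closed)
print(y_open)                    # 0.1250 0.3125 0.5000 0.8750
```

Whether this adjustment is appropriate depends on how the boundary values arise; when many observations are exactly 0 or 1, a model that treats them explicitly may be a better fit.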

I'm not sure what data you're trying to model, but another option is to transform the response and then model the transformed values with a univariate approach. Something like

$$ y = \log\left(\frac{x - a}{b - x}\right) $$

where $x$ is the bounded response, $a$ is its lower limit, and $b$ is its upper limit.

I believe this is used a lot in forecasting (see the article by the always excellent Rob Hyndman: https://robjhyndman.com/hyndsight/forecasting-within-limits/).
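
A minimal sketch of this transform-then-model idea in R, assuming simulated data, limits $a = 0$ and $b = 1$, and an ordinary linear model on the transformed scale. The helper names `to_unbounded` and `to_bounded` are hypothetical, and `resp` plays the role of $x$ in the formula above.

```r
# Hypothetical helpers for the scaled logit transform and its inverse.
to_unbounded <- function(x, a, b) log((x - a) / (b - x))
to_bounded   <- function(y, a, b) a + (b - a) * plogis(y)

a <- 0
b <- 1

# Simulated bounded response (illustrative only), strictly inside (a, b).
set.seed(2)
pred_var <- runif(100, -2, 2)
resp <- plogis(0.3 + 0.8 * pred_var + rnorm(100, sd = 0.3))

# Model the transformed response on the unbounded scale.
z <- to_unbounded(resp, a, b)
fit <- lm(z ~ pred_var)

# Back-transform predictions; they land inside (a, b) even when extrapolating.
new_data <- data.frame(pred_var = c(-5, 0, 5))
pred <- to_bounded(predict(fit, newdata = new_data), a, b)
print(pred)
```

The transform is undefined when the response equals $a$ or $b$ exactly, so boundary values would need separate handling.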

Aus_10

I believe the generic term is fractional response regression. There are logit, probit, and heteroskedastic probit conditional mean versions.

The standard reference for the logit case is:

Papke, L. E., and J. M. Wooldridge. 1996. "Econometric methods for fractional response variables with an application to 401(k) plan participation rates." Journal of Applied Econometrics 11: 619–632.
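
In R, one common way to fit the logit version is `glm()` with `family = quasibinomial`, which uses the Bernoulli quasi-likelihood and accepts fractional responses, including exact 0s and 1s. A minimal sketch, assuming simulated data with illustrative names; robust standard errors, which are usually reported with this model, are not shown here.

```r
# Simulated fractional response (illustrative only).
set.seed(3)
n <- 300
x <- rnorm(n)
mu <- plogis(0.2 + 0.7 * x)
y <- rbeta(n, mu * 10, (1 - mu) * 10)
y[1:5] <- c(0, 1, 0, 1, 0)   # boundary values are allowed in this model

# Fractional logit: logit link for the conditional mean E[y | x],
# estimated by quasi-maximum likelihood.
fit <- glm(y ~ x, family = quasibinomial(link = "logit"))
print(coef(fit))

# Fitted conditional means lie strictly inside (0, 1).
print(range(fitted(fit)))
```

Swapping `link = "logit"` for `link = "probit"` gives the probit conditional mean version.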

dimitriy