Questions tagged [predictive-models]

Predictive models are statistical models whose primary purpose is to predict other observations of a system optimally, as opposed to models whose purpose is to test a particular hypothesis or explain a phenomenon mechanistically. As such, predictive models place less emphasis on interpretability and more emphasis on performance.

Wikipedia has articles on predictive modelling (https://en.wikipedia.org/wiki/Predictive_modelling) and predictive analytics (https://en.wikipedia.org/wiki/Predictive_analytics), both with further references.

3152 questions
78
votes
16 answers

Practical thoughts on explanatory vs. predictive modeling

Back in April, I attended a talk at the UMD (University of Maryland) Math Department Statistics group seminar series called "To Explain or To Predict?". The talk was given by Prof. Galit Shmueli who teaches at UMD's Smith Business School. Her talk…
wahalulu
  • 171
14
votes
4 answers

AUC for someone with no stats knowledge

Can someone explain what area under the curve means for someone with absolutely no stats knowledge? For example, if a model claims an AUC of 0.9, does that mean that it makes an accurate prediction 90% of the time?
Forest
  • 243
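
As a rough illustration of the distinction this question asks about: AUC is not the share of correct predictions. It equals the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by brute force over all positive/negative pairs and compares it with accuracy at a fixed cutoff. The function names, the 0.5 threshold, and the toy labels and scores are all assumptions made up for this example, not taken from the question.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly when predicting 1 for score >= threshold."""
    hits = sum(1 for y, s in zip(labels, scores) if (s >= threshold) == (y == 1))
    return hits / len(labels)


if __name__ == "__main__":
    # Hypothetical data: every positive outranks every negative,
    # but no score reaches the 0.5 cutoff.
    labels = [0, 0, 0, 1, 1]
    scores = [0.10, 0.20, 0.30, 0.35, 0.40]
    print("AUC:", auc(labels, scores))
    print("accuracy at 0.5:", accuracy(labels, scores))
```

In this toy data the ranking is perfect while accuracy at the 0.5 cutoff is not, which is why an AUC of 0.9 cannot be read as "correct 90% of the time".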
11
votes
3 answers

How do bookmakers select their opening odds?

I know that bookmakers adjust their odds in order to maximize the gain, forecasting the probabilities of the volume of money placed in every outcome. How do bookmakers select their opening odds?
emanuele
  • 2,098
10
votes
3 answers

Statistics for online dating sites

I'm curious how an online dating system might use survey data to determine matches. Suppose they have outcome data from past matches (e.g., 1 = happily married, 0 = no 2nd date). Next, let's suppose they had 2 preference questions, "How much do…
d_a_c321
  • 1,259
8
votes
2 answers

Kappa for Predictive Model

The "standard" way to compute Kappa for a predictive classification model (Witten and Frank page 163) is to construct the random confusion matrix in such a way that the number of predictions for each class is the same as the model predicted. For a…
B_Miner
  • 8,630
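
The chance-corrected agreement described in this excerpt can be sketched as follows, assuming it refers to the usual Cohen's kappa, where the "random" confusion matrix keeps the model's per-class prediction totals and the actual class totals. The function name and the 2x2 counts are hypothetical, chosen only for illustration; rows are taken to be actual classes and columns predicted classes.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Marginal totals per class.
    actual = [sum(confusion[i]) for i in range(k)]
    predicted = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected from a random predictor with the same prediction totals.
    expected = sum(actual[i] * predicted[i] for i in range(k)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts: 35 of 50 cases on the diagonal.
    matrix = [[20, 5],
              [10, 15]]
    print("kappa:", cohen_kappa(matrix))
```

With these counts the observed agreement is 0.7 and the chance agreement implied by the marginals is 0.5, so kappa comes out near 0.4.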
7
votes
1 answer

How to verify that the model is real?

Here is a block diagram which I'm using when I want to verify that my model is real. In each round a fold of 11/12 of the data is used to build the model (e.g. eigenvectors of the PCA). After 12 rounds I check that the models (e.g. the…
Dov
  • 1,810
5
votes
1 answer

Building a customer cancellation predictor

I'm trying to build a cancellation predictor for telecom data. I am using both static (i.e. location, device, number of complaints, etc.) and temporal (i.e. time-series usage) data. The response variable is whether or not they cancelled within the…
user1893354
  • 1,875
5
votes
2 answers

How to test the predictive power of a model?

I want to build a model to predict the outcomes of experiments. My predictive model outputs scores ranging from 1 to 100. I want to test if my predictive scores can be used to classify experimental outcomes as "good" or "bad"…
evdstat
  • 191
4
votes
2 answers

How do I combine two predictors?

I am trying to classify a data set with 2 boolean values. I have two classifiers that may/may not be independent. The first one is 65% accurate, and the second one is 60% accurate. Can I combine the two somehow to get a better classifier? Or will…
4
votes
1 answer

How to correct for correlation at baseline between predictor and "DV"?

So say I am testing the predictive value of a predictor towards a "DV" within a survey data-set, between two time-points. Say also that I have measurements of both variables at all time points and that I can see that the DV and the predictor…
4
votes
3 answers

Evaluating loan applications for accept/reject

I am working on the problem of loan application acceptance/rejection. I have historical data of about 500K applications and about 70K loans that got funded out of these applications for various loan products and their performance histories. I want…
arun
  • 390
3
votes
1 answer

Building a supervised model based on constantly updating covariates?

I have a classification model that categorises customers' risk to a lending company over time. For example, a customer may appear credit-worthy, but with time, newly updated data we collect may indicate the contrary. More specifically, I want to…
3
votes
2 answers

Predicting Y based on distribution of X

Suppose I have two random variables Y and X, where Y is given as one point while X is given as a distribution. I am trying to predict Y based on X, however I cannot put the whole distribution of X in a column as I do not have one value. I could…
user2974951
  • 7,813
3
votes
0 answers

Can the variable that is responsible for bad performance of a predictive model be identified?

In the context of pharmacokinetics, we try to predict an experimental variable $E$ (obtained from an expensive in vivo study that can only be run on few molecules), by a theoretical model that uses (much cheaper to obtain) experimental variables $u$…
3
votes
0 answers

Using a time series as covariate in regression model

I have a binary outcome variable (Disease/No disease). In a diagnostic test for this disease, 20 different sensors record a time series value. These time series are relatively correlated, but each sensor measures a different thing. In order to…
skip
  • 31