I have the following situation:

- ~300 participants; for each of them I have ~30 participant-specific variables (from a questionnaire).
- For each participant I have ~200 data points, each consisting of 1 independent variable (reaction time) and 1 dependent/predicted variable (attention).
The goal: given a new participant with their ~200 data points, make a prediction for each of those points. So I clearly want case-by-case prediction, but one that takes participant-specific traits into account.
I'm used to working with data "case-wise", where each case has 1 dependent (predicted) variable and many independent variables (predictors). But here I need to predict variation within a participant while also taking the participant into account.
- I have tried normalizing all reaction times within participants and then training a model on each case, ignoring variation between participants. No good.
- Another option I considered: just adding all participant-related data to every point within a participant, making the data redundant but flat. I haven't tried it yet, but it seems more reasonable.
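To make the first option concrete, here is a minimal sketch of within-participant normalization (shown in Python with made-up data; I plan to do the real thing in R, but the idea is the same — each participant's reaction times are z-scored relative to their own mean and SD, which discards between-participant differences):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Made-up stand-in data: 3 participants, 4 trials each.
df = pd.DataFrame({
    "participant": np.repeat([1, 2, 3], 4),
    "reaction_time": rng.normal(500, 50, size=12),
})

# Within-participant z-scoring: after this, every participant's
# reaction times have mean 0 and SD 1.
df["rt_z"] = (
    df.groupby("participant")["reaction_time"]
      .transform(lambda x: (x - x.mean()) / x.std())
)
```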
But is there a good or conventional way to treat this kind of data? I believe it is a common situation for some people. I mean, how do I explain to the model that "these 200 data points belong to one participant", not just "these 200 data points happen to have the same values on 30 traits"?
Tool-wise: I plan to try a Random Forest and a Neural Network to see which does better. I'm doing it all in R.
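For reference, the second option above (the "redundant but flat" table) is just a join of the per-trial data with the per-participant traits. A minimal Python sketch with made-up data (the column names `trait_a`, `trait_b` are placeholders for the ~30 questionnaire variables):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Made-up stand-in data: 3 participants, 4 trials each, 2 traits.
trials = pd.DataFrame({
    "participant": np.repeat([1, 2, 3], 4),
    "reaction_time": rng.normal(500, 50, size=12),
    "attention": rng.normal(0, 1, size=12),
})
traits = pd.DataFrame({
    "participant": [1, 2, 3],
    "trait_a": [0.1, 0.5, 0.9],
    "trait_b": [1, 0, 1],
})

# "Flat" design: every trial row also carries its participant's traits,
# so the same trait values repeat across that participant's 200 rows.
flat = trials.merge(traits, on="participant", how="left")

X = flat[["reaction_time", "trait_a", "trait_b"]]
y = flat["attention"]
print(flat.shape)  # (12, 5)
```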
Could you elaborate more on "predicting a vector"? I have never predicted anything but a single value, but the idea makes sense. Can you do this with the same methods (say regression, random forest, or a neural network), or do I need to look for models specifically adapted to vector predictions? The main downside I can imagine is that a vector puts a lot of weight on order, and the model might, for example, consider one participant's "4th trial" to be more relevant to another participant's "4th trial" than to "any trial". Is that so?
– Igor Sokolov Oct 31 '19 at 03:28

The tricky part is your loss function. Suppose your "irreducible error" is very high for everybody's 100th trial, and very low for everybody's 1st trial. If you optimize MSE, then a neural net might pay too much attention to 100th trials, since that's what's contributing the most to the overall loss.
– goopy Oct 31 '19 at 18:04
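For what it's worth, "predicting a vector" can work with off-the-shelf tools: scikit-learn's `RandomForestRegressor`, for instance, accepts a 2-D target (one column per trial position) without any special setup. A minimal sketch on synthetic data (the dimensions mirror the question; the data itself is random noise, just to show the shapes):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

n_participants, n_trials, n_traits = 50, 200, 30

# One row per participant: traits as inputs, the whole
# 200-trial sequence as a vector-valued target.
X = rng.normal(size=(n_participants, n_traits))
Y = rng.normal(size=(n_participants, n_trials))

# Multi-output regression: y may be 2-D; the forest predicts
# all 200 outputs jointly.
model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(X, Y)

pred = model.predict(X[:1])
print(pred.shape)  # (1, 200)
```

Note that this setup does treat the output index as meaningful (column 4 is "everybody's 4th trial"), which is exactly the order-sensitivity concern raised in the comment above.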