
I have a set of events of type 1 with their start and end times, and a set of events of type 2 with their start and end times. I'm struggling to work out how to test whether the two event types occur independently of each other, or whether one tends to follow the other in time. I can think of some fairly simplistic approaches (e.g. a histogram of the time between an event of type 1 and the next event of type 2), but I wondered whether this fits into some branch of statistics that I'm not familiar with. Could someone point me towards it?

EDIT

A little more info: this relates to data from a group of animals feeding from a machine that has space for only one animal at a time. (So event type 1 is animal 1 going to feed, event type 2 is animal 2 going to feed, etc.) Lack of independence would therefore probably show up as one animal feeding fairly shortly after another; it is not possible for them to feed concurrently. The real question is whether there is some kind of social structure to these young animals' feeding behaviour, or whether they just go to feed when they feel hungry. (There are also more than two animals in the pen, but I had been trying to keep my question as simple as possible. I realise now that the extra detail is probably quite important! There are seven animals in the pen, so it is manageable to look at it pairwise if that's easier.)
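
For concreteness, the "time between an event of one type and the next event of the other type" summary could be computed roughly as follows. This is only a sketch: the function name `gaps_to_next` and the toy `(start, end)` times are made up for illustration, and it assumes times are plain numbers in a common unit.

```python
from bisect import bisect_left

def gaps_to_next(ends_i, starts_j):
    """For each end time of a type-i event, return the waiting time until
    the next start of a type-j event (ends with no later start are skipped)."""
    starts_j = sorted(starts_j)
    gaps = []
    for end in ends_i:
        k = bisect_left(starts_j, end)  # index of first start at or after `end`
        if k < len(starts_j):
            gaps.append(starts_j[k] - end)
    return gaps

# Hypothetical toy data: (start, end) times in minutes for two animals.
events_1 = [(0, 5), (30, 34), (60, 66)]
events_2 = [(6, 10), (40, 45), (67, 70)]

gaps = gaps_to_next([end for _, end in events_1],
                    [start for start, _ in events_2])
print(gaps)
```

The resulting list is what would go into one cell of a pairwise grid of histograms; on its own it says nothing about how large the gaps would be under independence.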

justme
  • Do you have any specific theories about how and why one event would follow another and about how much time ought to elapse? Such information can be helpful in formulating a model as well as proposing appropriate statistics to measure lack of independence. – whuber Jun 29 '22 at 16:06
  • Can two or more events of the same type happen concurrently? – user4422 Jun 29 '22 at 21:46
  • @whuber -- thank you for the question, which I probably should have addressed in the initial question with a little more detail. I've edited the question to add that info. In brief: non-independence would probably show up as one event tending to follow the other in a fairly short space of time (but a little more info in the question now) – justme Jun 30 '22 at 07:46
  • @user4422 -- thank you for the question, which I probably should have addressed in the initial question. I've edited the question to add that info. No they cannot be concurrent. – justme Jun 30 '22 at 07:47
  • What kinds of exploratory graphics and summaries have you produced? They could be useful in helping readers understand the data and the problem. If you haven't considered such exploration, that should be your very next step and you might want to solicit suggestions about what graphics would be useful and effective. – whuber Jun 30 '22 at 12:01
  • @whuber Of course, great advice, and I have tried to explore it in a number of ways. For example, I plotted a grid of histograms of time-between end-of-feeding of animal_i and start of feeding of animal_j. There were some interesting patterns in there, certain pairs of animals seeming to have little spikes at short time periods (as part of a broader histogram). The problem for me is that I don't know how I can tell if that is just a natural result of noise (and hence the interest in modelling it). I've also used clustering to try to find a natural cutoff to cluster feedings into group... – justme Jun 30 '22 at 14:42
  • ...meals and looked at which animals seem to co-eat within these meals. But again, it's hard to know how strongly to interpret this without some kind of model that understands how much noise we would anyway expect... – justme Jun 30 '22 at 14:43

1 Answer


The feeding machine has three possible states:

  1. idle;
  2. occupied by type A;
  3. occupied by type B.

If we discretize time, then we can model our problem with a Markov chain with 3 possible states.

We need to estimate a 3x3 transition probability matrix whose entries are the probabilities of switching from one of the 3 states to another one.
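
As a rough illustration of the discretization and of the simplest (time-invariant) estimate, one can count transitions between consecutive time steps and normalize each row. The helper names `discretize` and `transition_matrix`, the step size and the toy intervals below are all assumptions made for this sketch.

```python
IDLE, A, B = 0, 1, 2

def discretize(events_a, events_b, horizon, step=1):
    """Turn (start, end) intervals into one state per time step.
    A step is labelled by whoever occupies the machine at its start."""
    n = horizon // step
    states = [IDLE] * n
    for label, events in ((A, events_a), (B, events_b)):
        for start, end in events:
            for t in range(n):
                if start <= t * step < end:
                    states[t] = label
    return states

def transition_matrix(states, n_states=3):
    """Row-normalized counts of transitions between consecutive steps."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs

# Hypothetical toy data: A feeds twice, B feeds once, over 10 time steps.
states = discretize(events_a=[(0, 2), (6, 8)], events_b=[(3, 5)], horizon=10)
P = transition_matrix(states)
for row in P:
    print(row)
```

Row `i` of `P` is the estimated distribution of the next state given current state `i`. The choice of step size matters: with very short steps, almost all of the probability mass sits on the diagonal.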

The estimation of time-invariant transition probabilities can be done as described in this question.

However, in our case we probably need to make the transition probabilities time-varying and dependent on some observables (e.g., a dummy that tells us whether the machine has been occupied in the last x minutes by A or B). Techniques for estimating such models are described in the time-series textbook by James Hamilton and in several other places (e.g., here).
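
A crude way to see what "dependent on a dummy" means, before fitting anything more formal, is to estimate the transitions out of the idle state separately for each value of the dummy. This is a plain stratified frequency count, not the estimation procedure from Hamilton's textbook; the function name, the window length and the toy state sequence are assumptions for illustration.

```python
IDLE, A, B = 0, 1, 2

def conditional_idle_transitions(states, window):
    """Estimate P(next state | currently idle, dummy), where dummy = 1 if
    A occupied the machine in any of the `window` steps before the current one."""
    counts = {0: [0, 0, 0], 1: [0, 0, 0]}
    for t in range(len(states) - 1):
        if states[t] != IDLE:
            continue
        recent = states[max(0, t - window):t]
        dummy = 1 if A in recent else 0
        counts[dummy][states[t + 1]] += 1
    probs = {}
    for dummy, row in counts.items():
        total = sum(row)
        probs[dummy] = [c / total if total else 0.0 for c in row]
    return probs

# Hypothetical discretized sequence in which B tends to feed right after A.
states = [A, IDLE, B, IDLE, IDLE, A, IDLE, B, IDLE, IDLE]
probs = conditional_idle_transitions(states, window=1)
print("A not recent:", probs[0])
print("A recent:    ", probs[1])
```

If the idle-to-B probability differs markedly between the two rows, that is the kind of dependence a time-varying transition model would pick up, although a real analysis would also need some measure of uncertainty around these frequencies.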

You can then use the estimated probabilities, and the estimated effects of the observables on them, to answer questions such as whether B is more likely to start feeding when A has just finished.

A simpler alternative would be to train a classification model (having as output the probability distribution of the next state and as input the one-hot encodings of the current state, plus other predictors such as statistics about the occupancy of the feeding machine in the previous minutes).
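
A minimal sketch of such a classifier is given below, using a small multinomial logistic regression written out by hand so that it has no dependencies. The feature choice (fraction of the last `window` steps occupied by A and by B), the learning rate, the number of epochs and the toy sequence are all assumptions for illustration; in practice an off-the-shelf classifier would do the same job.

```python
import math

IDLE, A, B = 0, 1, 2

def one_hot(k, n=3):
    return [1.0 if i == k else 0.0 for i in range(n)]

def make_dataset(states, window):
    """Features: one-hot current state, plus the fraction of the previous
    `window` steps occupied by A and by B. Target: the next state."""
    X, y = [], []
    for t in range(window, len(states) - 1):
        recent = states[t - window:t]
        occupancy = [recent.count(A) / window, recent.count(B) / window]
        X.append(one_hot(states[t]) + occupancy)
        y.append(states[t + 1])
    return X, y

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_proba(W, x):
    """Predicted distribution over the next state for feature vector x."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def fit_softmax(X, y, n_classes=3, lr=0.5, epochs=500):
    """Multinomial logistic regression fitted by batch gradient descent.
    The one-hot block sums to 1, so it plays the role of an intercept."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, target in zip(X, y):
            p = predict_proba(W, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == target else 0.0)
                for j, v in enumerate(x):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

# Hypothetical sequence in which A and B strictly alternate visits.
states = [A, IDLE, B, IDLE] * 10
X, y = make_dataset(states, window=2)
W = fit_softmax(X, y)

p_after_a = predict_proba(W, one_hot(IDLE) + [0.5, 0.0])  # idle, A fed recently
p_after_b = predict_proba(W, one_hot(IDLE) + [0.0, 0.5])  # idle, B fed recently
print(p_after_a)
print(p_after_b)
```

Comparing the two predicted distributions for the idle state shows how recent occupancy shifts the probability of who feeds next, which is the quantity of interest here.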

user4422
  • Thanks, this is excellent. I had ruled out Markov models on account of the Markov property, but I didn't know about time-varying transition probabilities. Though, thinking about it, with your last suggestion, I wonder if the classification predictors "current state + statistics about occupancy" could be replaced by a single pair of "time since occupancy of A, time since occupancy of B" predictors? – justme Jun 30 '22 at 09:07
  • (Oh -- or would I have to put in the "current state" predictors in the same way you need to include lagged predictors in other time series models?) – justme Jun 30 '22 at 09:08
  • I guess you are mostly interested in the transition from idle to occupied. So you need to condition on idle. Hence you need the one hot encoding of the current state in the set of predictors. – user4422 Jun 30 '22 at 09:36