Inter-arrival times (the consecutive differences of arrival times) in a Poisson process have an Exponential distribution. In models of this kind of data, I have usually seen the arrival rate (lambda) held constant. Sometimes I have seen a non-homogeneous approach where lambda changes as a deterministic function of time.
In the non-homogeneous approaches I have seen, the arrival rate usually changes as a basic step/staircase function. But I was wondering whether it is also possible to let the arrival rate parameter change probabilistically/stochastically according to some process of its own.
For example, perhaps the arrival rate could follow an Autoregressive process, or fluctuate according to a discrete Markov chain (e.g. two states lambda1, lambda2 with transition probabilities P(lambda1 -> lambda1), P(lambda1 -> lambda2), P(lambda2 -> lambda1), P(lambda2 -> lambda2)).
I think this might make the models more flexible and realistic, since it's quite likely that rates hover stochastically around certain levels instead of moving up or down deterministically ... but I am not sure whether this is allowed (i.e. a stochastically changing rate parameter), since it might violate assumptions or complicate the modelling/inference process?
I wrote some R simulations to illustrate what I am talking about:
library(ggplot2)
set.seed(123)
# Case 1: AR(1) process for the arrival rate
n <- 500      # number of time periods
phi <- 0.9    # AR(1) coefficient

# Define the constant (baseline) rate
lambda <- 5

# Simulate the AR(1) process
arrival_rate_ar <- arima.sim(n = n, model = list(ar = phi),
                             sd = sqrt(lambda * (1 - phi^2)))

# Ensure all arrival rates are positive
arrival_rate_ar <- abs(arrival_rate_ar) + lambda

# Simulate the arrival data with the AR(1) arrival rate
arrival_data_ar <- rpois(n, lambda = arrival_rate_ar)
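# (Not part of the original script - just an illustrative alternative.) The abs()
# above truncates the AR(1) path, which distorts its autocorrelation. A common
# alternative is to put the AR(1) on log(lambda_t), which keeps the rate positive
# automatically; the innovation sd below is an arbitrary illustrative choice.
log_rate_ar <- log(lambda) + arima.sim(n = n, model = list(ar = phi),
                                       sd = sqrt(0.25 * (1 - phi^2)))
arrival_rate_logar <- exp(log_rate_ar)
arrival_data_logar <- rpois(n, lambda = arrival_rate_logar)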
# Case 2: Simulate a constant arrival rate
arrival_rate_constant <- rep(lambda, n)
arrival_data_constant <- rpois(n, lambda = arrival_rate_constant)
# Case 3: Two-state switching rate (simple Markov chain on {lambda1, lambda2})
lambda1 <- 3
lambda2 <- 8
p <- 0.05   # probability of switching state at each time step
arrival_rate_switch <- rep(lambda1, n)
for (i in 2:n) {
  if (runif(1) < p) {
    # switch to the other rate
    arrival_rate_switch[i] <- ifelse(arrival_rate_switch[i - 1] == lambda1,
                                     lambda2, lambda1)
  } else {
    # stay at the current rate
    arrival_rate_switch[i] <- arrival_rate_switch[i - 1]
  }
}
arrival_data_switch <- rpois(n, lambda = arrival_rate_switch)
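# (Also not part of the original script.) The same idea with an explicit 2x2
# transition matrix, as described in the question - P[i, j] is the probability
# of moving from rate i to rate j; the matrix values are just illustrative.
lambdas <- c(lambda1, lambda2)
P <- matrix(c(0.95, 0.05,
              0.10, 0.90), nrow = 2, byrow = TRUE)
state <- integer(n)
state[1] <- 1
for (i in 2:n) {
  state[i] <- sample(1:2, 1, prob = P[state[i - 1], ])
}
arrival_rate_markov <- lambdas[state]
arrival_data_markov <- rpois(n, lambda = arrival_rate_markov)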
# Combine everything into a data frame
df <- data.frame(Time = rep(1:n, 3),
                 ArrivalRate = c(arrival_rate_ar,
                                 arrival_rate_constant,
                                 arrival_rate_switch),
                 ArrivalData = c(arrival_data_ar,
                                 arrival_data_constant,
                                 arrival_data_switch),
                 RateType = rep(c("AR(1)", "Constant", "Switch"), each = n))
# Plots: arrival data and arrival rate for each of the three cases
p1 <- ggplot(df[df$RateType == "AR(1)", ], aes(x = Time, y = ArrivalData)) +
  geom_line() +
  ggtitle("Arrival Data (AR(1) Rate)") +
  xlab("Time") + ylab("Number of Arrivals") + theme_bw()

p2 <- ggplot(df[df$RateType == "AR(1)", ], aes(x = Time, y = ArrivalRate)) +
  geom_line() +
  ggtitle("Arrival Rate (AR(1))") +
  xlab("Time") + ylab("Rate") + theme_bw()

p3 <- ggplot(df[df$RateType == "Constant", ], aes(x = Time, y = ArrivalData)) +
  geom_line() +
  ggtitle("Arrival Data (Constant Rate)") +
  xlab("Time") + ylab("Number of Arrivals") + theme_bw()

p4 <- ggplot(df[df$RateType == "Constant", ], aes(x = Time, y = ArrivalRate)) +
  geom_line() +
  ggtitle("Arrival Rate (Constant)") +
  xlab("Time") + ylab("Rate") + theme_bw()

p5 <- ggplot(df[df$RateType == "Switch", ], aes(x = Time, y = ArrivalData)) +
  geom_line() +
  ggtitle("Arrival Data (Switching Rate)") +
  xlab("Time") + ylab("Number of Arrivals") + theme_bw()

p6 <- ggplot(df[df$RateType == "Switch", ], aes(x = Time, y = ArrivalRate)) +
  geom_line() +
  ggtitle("Arrival Rate (Switching)") +
  xlab("Time") + ylab("Rate") + theme_bw()
- Is the approach I described mathematically sound?
- Is this kind of approach common in statistics (i.e. suppose we observe data and want to fit models of this kind to it)?
- Do people actually use these kinds of approaches, or is it unnecessarily complicated or mathematically incorrect?
- Or perhaps (due to the stochastic nature of the models) the approaches I described would result in parameter estimates that have large variances, are biased, are not consistent, or are not asymptotically normal?
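To make the fitting question concrete, here is a minimal sketch of one way the switching version could be fitted to observed counts: treat it as a two-state Poisson hidden Markov model. This assumes the depmixS4 package is available (just one possible choice among several HMM packages) and uses the simulated switching data from above:

library(depmixS4)  # assumed to be installed; other HMM packages would also work
hmm_df <- data.frame(y = arrival_data_switch)
mod <- depmix(y ~ 1, data = hmm_df, nstates = 2, family = poisson())
fm <- fit(mod)
summary(fm)          # estimated transition matrix and state intercepts (log-rate scale)
head(posterior(fm))  # most likely state and state probabilities over time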
Would be interested to hear opinions on this. The closest things I could find to the approach I described were:
- "Doubly Stochastic Processes"
- Coxian Process
- the financial Heston Model (ie Black-Scholes where variance is now a stochastic time parameter)
- a combination of a Poisson Thinning Process and Compounding Poisson Process?
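For reference, here is a minimal continuous-time sketch of the doubly stochastic (Cox) idea via thinning: simulate candidate points from a homogeneous Poisson process at an upper-bound rate lambda_max and keep each candidate at time t with probability lambda(t)/lambda_max, where lambda(t) is itself a random path. The two-state continuous-time switching intensity below, and all the specific numbers, are just illustrative choices, not part of the question's original setup:

# simulate a two-state continuous-time intensity path (exponential holding times)
T_end <- 100
lam_states <- c(3, 8)
switch_rate <- 0.1                        # rate of jumping to the other state
jump_times <- 0
path_states <- sample(1:2, 1)
while (tail(jump_times, 1) < T_end) {
  jump_times  <- c(jump_times, tail(jump_times, 1) + rexp(1, switch_rate))
  path_states <- c(path_states, 3 - tail(path_states, 1))   # flip state
}
lambda_t <- function(t) lam_states[path_states[findInterval(t, jump_times)]]

# thinning: propose at lambda_max, accept each point with prob lambda(t)/lambda_max
lambda_max <- max(lam_states)
n_prop   <- rpois(1, lambda_max * T_end)
proposal <- sort(runif(n_prop, 0, T_end))
arrivals <- proposal[runif(n_prop) < lambda_t(proposal) / lambda_max]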
