Background/problem I am trying to solve: I have vehicle timeline data for position, velocity, accelerations for all vehicles on a section of a motorway for ~30min where adjacent data points are spaced ~0.1s in time. I am trying to construct a regression neural network model which can predict one-dimensional vehicle accelerations i.e. vehicles remain in one lane and we are omitting lane changes. For each vehicle in the dataset I am working with, I extract the following feature vector:
[Vehicle speed, Vehicle acceleration, Distance to vehicle ahead, Speed of vehicle ahead, Acc of vehicle ahead].
In the case where there is no preceding vehicle, I set the final 3 features related to the vehicle ahead to an arbitrarily large value of 9999.99. This would essentially indicate a vehicle which is super far away and travelling super fast so we can ignore it and our acceleration output would only depend upon our own dynamics (the first 2 features).
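A minimal sketch of the feature vector described above, including the 9999.99 placeholder for a missing leader. The function name, the `leader` tuple layout and the 20 m example gap are illustrative assumptions, not the actual extraction code.

```python
import numpy as np

NO_LEADER = 9999.99  # placeholder from the text for "no vehicle ahead"


def build_features(speed, acc, leader=None):
    """Return [speed, acc, gap, leader_speed, leader_acc] as a float array.

    `leader` is a hypothetical (gap, speed, acc) tuple, or None when there
    is no preceding vehicle in the lane.
    """
    if leader is None:
        gap = lead_speed = lead_acc = NO_LEADER
    else:
        gap, lead_speed, lead_acc = leader
    return np.array([speed, acc, gap, lead_speed, lead_acc], dtype=float)


# Example rows: one with a leader 20 m ahead (made-up gap), one on a free road.
with_leader = build_features(13.4, 0.0, leader=(20.0, 5.069, 0.0))
free_road = build_features(13.4, 0.0)
```

Note that the placeholder is several orders of magnitude larger than any real speed or gap, so any input scaling has to cope with it.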
Intuitively, I feel like this is more than enough information to produce reasonable acceleration outputs: when I am driving myself, these are pretty much the only things I consider when deciding whether to brake or accelerate. I am sure there is some sort of non-linear relationship between this feature vector and vehicle acceleration outputs, which is why I have decided to use a neural network to learn the mapping from my feature input to the acceleration output. Let me know if you disagree with these features being sufficient for this regression problem.
Issues I am facing: No matter what I try I can't seem to get my neural network to produce "reasonable" outputs. I am expecting something in the form of:
- Negative acceleration outputs when distance to vehicle ahead is low and vehicle speed > speed of vehicle ahead
- Positive acceleration when there is no vehicle ahead i.e. final 3 elements of feature vector are 9999.99
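The two expectations above can be written as a quick sign check on any predictor. This is only a sketch: `check_expected_signs`, the 5 m "low" gap and the stand-in `toy_predict` are assumptions for illustration, and `toy_predict` is not the trained network.

```python
import numpy as np

NO_LEADER = 9999.99  # placeholder value used when there is no vehicle ahead


def check_expected_signs(predict):
    """`predict` maps a (5,) feature array to a single acceleration value."""
    # Follower at 13.4 m/s closing on a slower leader only 5 m ahead (assumed gap).
    closing_in = np.array([13.4, 0.0, 5.0, 5.069, 0.0])
    # Same follower with no vehicle ahead.
    free_road = np.array([13.4, 0.0, NO_LEADER, NO_LEADER, NO_LEADER])
    return {
        "brakes_when_closing": bool(predict(closing_in) < 0.0),
        "accelerates_when_free": bool(predict(free_road) > 0.0),
    }


def toy_predict(x):
    """Hand-written stand-in that satisfies both expectations by construction."""
    speed, _, gap, lead_speed, _ = x
    if gap >= NO_LEADER:
        return 1.0
    return -1.0 if (speed > lead_speed and gap < 10.0) else 0.0


result = check_expected_signs(toy_predict)
```

For a PyTorch model, `predict` could be something like `lambda x: model(torch.from_numpy(x).float().unsqueeze(0)).item()`.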
I will attach an example of the kind of outputs I am getting below.
Data preparation: For training, I have around 800 000 - 900 000 data points, which correspond to around 1 000 different vehicles over the 30 min period in which the data was collected. For the inputs to the NN I extract the 5 features described above; for the outputs I look around 2.5 sec into the future from the time at which a feature vector was extracted and average the vehicle acceleration across this window. I chose 2.5 sec because it takes around 1-3 sec for a driver's action to manifest in response to the state they are in (this parameter is not fixed and I can modify it as required). I also smooth the entire dataset slightly using a moving average function, since there tend to be fluctuations in the logged velocities and accelerations.
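A minimal sketch of the label construction and smoothing described above, applied to one vehicle's acceleration trace at a time. The smoothing window of 5 samples, the edge padding, and the choice to start the averaging window at the next sample (rather than the current one) are assumptions; only the 0.1 s spacing and the 2.5 s horizon come from the description.

```python
import numpy as np

DT = 0.1         # sample spacing in seconds (from the description)
HORIZON_S = 2.5  # look-ahead window in seconds (from the description)


def moving_average(x, window=5):
    """Centered moving average; the ends are padded with the edge values."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the output stays aligned")
    padded = np.pad(np.asarray(x, dtype=float), window // 2, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")


def future_mean_acceleration(acc, dt=DT, horizon_s=HORIZON_S):
    """Label for index t is the mean of acc[t+1 : t+1+n], with n = horizon_s / dt.

    The last n samples have no complete future window, so the result has
    len(acc) - n entries and pairs with the first len(acc) - n feature rows.
    """
    acc = np.asarray(acc, dtype=float)
    n = int(round(horizon_s / dt))  # 25 samples for 2.5 s at 0.1 s spacing
    csum = np.concatenate(([0.0], np.cumsum(acc)))
    # Sum of acc[t+1 .. t+n] equals csum[t+n+1] - csum[t+1].
    return (csum[n + 1:] - csum[1:-n]) / n


# Synthetic trace, for illustration only.
acc = np.arange(30, dtype=float)
labels = future_mean_acceleration(acc)
smoothed = moving_average(np.ones(10))
```

Both functions need to be applied per vehicle so that a window never spans two different vehicles.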
Neural network architecture I am using: The code for the neural network architecture is below. I have kept the network relatively small because I assumed the mapping between my inputs and outputs should be relatively straightforward to learn, since there should be a strong correlation between the two. I am using Python and PyTorch for all of my neural network code.
import torch.nn as nn
import torch.nn.functional as F


class NN(nn.Module):
    def __init__(self, input_size, output_size):
        super(NN, self).__init__()  # calls the constructor of nn.Module
        self.fc1 = nn.Linear(input_size, 50)
        self.fc2 = nn.Linear(50, 50)
        self.fc3 = nn.Linear(50, 50)
        self.fc4 = nn.Linear(50, output_size)

    def forward(self, x):  # where x is the input data
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)  # no activation on the output: unbounded regression value
        return x
Below this I will attach the code which I use to train the model:
import time

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

import neuralnet  # module containing the NN class above

# Hyperparameters
learning_rate = 0.0001
num_epochs = 10
batch_size = 1024
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load inputs and outputs
input_size, output_size = 5, 1
X = np.loadtxt('NN_input_data', dtype=float)
labels = np.transpose([np.loadtxt('NN_output_data', dtype=float)])  # shape (N, 1)
X = torch.from_numpy(X).float()
labels = torch.from_numpy(labels).float()
train_data = torch.utils.data.TensorDataset(X, labels)
train_loader = torch.utils.data.DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)

# Build network
model = neuralnet.NN(input_size=input_size, output_size=output_size).to(device)

# Loss and optimiser
Loss_Criterion = nn.MSELoss()
Optimiser = optim.Adam(model.parameters(), lr=learning_rate)

# Train network
print('Training Network:')
num_batches = len(train_loader)
for epoch in range(num_epochs):
    t1 = time.time()
    for batch_idx, (data, targets) in enumerate(train_loader):
        data = data.to(device=device)
        targets = targets.to(device=device)
        # Forward pass
        scores = model(data)
        loss = Loss_Criterion(scores, targets)
        # Backwards pass
        Optimiser.zero_grad()  # zero gradients from previous step
        loss.backward()  # backwards pass
        # Gradient descent step
        Optimiser.step()  # updates weights
    # Print statistics (the loss shown is that of the last batch in the epoch)
    print('Epoch:', epoch, 'Loss:', round(loss.item(), 3), 'Time Taken:', round((time.time()-t1)/60, 3), 'min')

# Save neural network
torch.save(model.state_dict(), 'NNRegModel_a')
The training loss I obtain from this code is shown below:
[Figure: Training loss after each epoch]
Finally, I will also attach below the outputs that this neural network produces in simulation in response to a given feature vector. The initial speeds of the vehicle under consideration and the vehicle ahead are 13.4 m/s and 5.069 m/s, respectively. The simulation ran for 2.1 sec in total and terminated with a collision because the neural network outputs were not sufficient to slow the vehicle enough to avoid it. The simulation logs show that even though the distance to the vehicle ahead is shrinking, the acceleration output is not getting larger in magnitude (i.e. it is not reducing the vehicle speed faster), so the neural network outputs are not accurate enough.
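For reference, a closed-loop test of this kind can be sketched as a simple forward-Euler rollout. Everything beyond the two initial speeds is an assumption here: the initial gap is not stated above, so `gap0` is a made-up value, the leader is assumed to hold a constant speed, the predicted acceleration is applied at every 0.1 s step, and `constant_brake` is a stand-in policy rather than the trained network.

```python
import numpy as np


def simulate(policy, gap0, v0=13.4, v_lead=5.069, dt=0.1, t_max=10.0):
    """Roll a follower forward behind a constant-speed leader.

    `policy` maps the 5-element feature vector to an acceleration in m/s^2.
    Returns (log, collided), where log holds (time, gap, speed, acceleration).
    """
    gap, v, a = float(gap0), float(v0), 0.0
    log = []
    for k in range(int(round(t_max / dt))):
        # Leader acceleration is 0.0 because its speed is assumed constant.
        features = np.array([v, a, gap, v_lead, 0.0])
        a = float(policy(features))
        log.append((k * dt, gap, v, a))
        v = max(v + a * dt, 0.0)   # the follower cannot reverse
        gap += (v_lead - v) * dt   # gap shrinks while the follower is faster
        if gap <= 0.0:
            return log, True       # collision
    return log, False


def constant_brake(features):
    """Stand-in policy: brake at a fixed 3 m/s^2 regardless of the inputs."""
    return -3.0


log, collided = simulate(constant_brake, gap0=30.0)
```

Logging the gap next to the commanded acceleration makes it easy to see whether the braking grows as the gap closes.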
Given the starting speeds of the two vehicles and the distance between them, I believe it is not unreasonable to expect the model to prevent a collision; such situations (or worse) arise often in real-life driving and drivers avoid a collision just fine.
Things I have tried:
I have tried playing around with the learning rate, increasing the epochs, changing the batch size but I get a similar (or worse) response.
I also tried excluding all the data points with a 9999.99 value from training, so that the neural network is trained without these extreme points; however, the network then produces extremely large acceleration outputs when it sees such a data point in simulation.
I have attempted to take the sigmoid of the inputs in order to scale the extreme values of 9999.99 down to a more reasonable scale (the sigmoids I use are adjusted to account for the mean and variance of the corresponding feature). This does not produce a better result than the one shown below.
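A sketch of what such a mean/variance-adjusted sigmoid scaling can look like. Computing the statistics only from rows that have a leader, and clipping the standardised value before the exponential, are assumptions made for this example; the function name is illustrative.

```python
import numpy as np

NO_LEADER = 9999.99  # placeholder value used when there is no vehicle ahead


def sigmoid_scale(X, mean, std):
    """Map each column to (0, 1) via sigmoid((x - mean) / std)."""
    X = np.asarray(X, dtype=float)
    safe_std = np.where(std > 0, std, 1.0)         # avoid division by zero
    z = np.clip((X - mean) / safe_std, -50.0, 50.0)  # keeps exp() well-behaved
    return 1.0 / (1.0 + np.exp(-z))


# Toy data with columns [speed, gap]; the last row has no leader.
X = np.array([[10.0, 20.0],
              [20.0, 40.0],
              [30.0, NO_LEADER]])

# Assumption: statistics come only from rows that actually have a leader,
# so the placeholder does not dominate the mean and standard deviation.
has_leader = X[:, 1] < NO_LEADER
mean = X[has_leader].mean(axis=0)
std = X[has_leader].std(axis=0)

scaled = sigmoid_scale(X, mean, std)
```

With this scaling the placeholder saturates very close to 1, so "no leader" and "leader very far away" become almost indistinguishable to the network.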
Finally, I have also tried a couple different activation functions (apart from ReLU, I have tried sigmoid and tanh) which also did not produce an improved result.
Any comments on my data processing, neural network, code or the problem as a whole?