I have extracted the optical flow along the x and y axes and want to pass it into a ConvNet. What I cannot work out is whether the two components should be fed in as two separate input channels, or whether I should combine them in some way, such as stacking, adding, or averaging them.
Paper: Two-Stream Convolutional Networks for Action Recognition in Videos
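
To make the options concrete: as far as I can tell, the paper stacks the horizontal and vertical flow fields of L consecutive frames along the channel axis, giving 2L input channels, rather than adding or averaging them (which would throw away the direction of motion). Below is a minimal NumPy sketch of that stacking. The function name `stack_flow`, the channels-first `(L, H, W)` layout, and the random data are my own assumptions for illustration, not something taken from the paper.

```python
import numpy as np

def stack_flow(flow_x, flow_y):
    """Interleave per-frame x and y flow fields into one (2L, H, W) array.

    flow_x, flow_y: arrays of shape (L, H, W), one flow field per frame pair.
    The (L, H, W) layout is an assumption made for this sketch.
    """
    flow_x = np.asarray(flow_x, dtype=np.float32)
    flow_y = np.asarray(flow_y, dtype=np.float32)
    if flow_x.ndim != 3 or flow_x.shape != flow_y.shape:
        raise ValueError("expected two arrays of identical shape (L, H, W)")
    num_frames, height, width = flow_x.shape
    stacked = np.empty((2 * num_frames, height, width), dtype=np.float32)
    stacked[0::2] = flow_x  # even channels: horizontal component
    stacked[1::2] = flow_y  # odd channels: vertical component
    return stacked

# Hypothetical data: L = 10 flow fields of size 224 x 224.
rng = np.random.default_rng(0)
fx = rng.standard_normal((10, 224, 224))
fy = rng.standard_normal((10, 224, 224))
x = stack_flow(fx, fy)
print(x.shape)  # (20, 224, 224)
```

With this layout the first convolutional layer of the network would simply take 2L input channels (20 here) instead of the usual 3 for RGB.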