A NN's input dimension is set by the number of features in the data and has little to do with the number of data points. For example, consider DNA data where we only have 1000 people / instances, but each instance has millions of features. In such a setting it is perfectly fine to build a NN with millions of inputs.
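A minimal sketch of this point, with made-up shapes (Keras assumed; 10,000 features stands in for the "millions" in the DNA example): the input layer width is fixed by the number of features per instance, while the instance count only determines how many rows we train on.

```python
import numpy as np
from tensorflow import keras

n_instances = 1000   # e.g. 1000 people
n_features = 10_000  # features per person; could be millions in practice

model = keras.Sequential([
    keras.Input(shape=(n_features,)),             # width = feature count
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # e.g. a binary outcome
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# X has shape (n_instances, n_features); the row count never appears
# in the architecture itself.
X = np.random.rand(n_instances, n_features).astype("float32")
y = np.random.randint(0, 2, size=(n_instances, 1))
model.fit(X, y, epochs=1, batch_size=32, verbose=0)
```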
On your question about dimension reduction: the key idea of using a NN is letting the model figure out the feature engineering / necessary transformations itself, so the model can do the feature reduction for us automatically. It is not very common to run feature reduction (say, PCA) first and then feed the result into a NN, unless there are computational resource constraints.
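A hedged sketch of the two options on synthetic data with illustrative sizes (scikit-learn assumed, nothing from this thread): feeding raw features straight to a NN, which learns its own compression in the hidden layer, versus an explicit PCA step up front, which is mainly a computational convenience.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 500))  # 1000 instances, 500 raw features
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=1000)

# Option A: raw features; the hidden layer does the "feature reduction".
nn_raw = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)

# Option B: explicit PCA first, mostly useful when compute/memory is tight.
nn_pca = make_pipeline(
    StandardScaler(),
    PCA(n_components=50),
    MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)

nn_raw.fit(X, y)
nn_pca.fit(X, y)
print("R^2 raw features:", round(nn_raw.score(X, y), 3))
print("R^2 with PCA    :", round(nn_pca.score(X, y), 3))
```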
Also, an output of 2K values seems like a lot to me. Are you trying to predict a discrete outcome with 2K possible values? In most cases people predict far fewer possible values, such as a binary Yes/No. (There are reasons why predicting 2K possible values is hard, which I will not go into here.)
Are you saying that the number of inputs ($7k$ in my case) is not necessarily equal to the number of features one extracts from those inputs? Or are you simply saying that a high number of inputs/features is not an issue regardless of the number of data points?
I'm forecasting daily sales for $2k$ products. For now I'm considering $340$ products (i.e. outputs) and a corresponding $\sim 1k$ inputs. Forecasting performance when minimizing weighted MAPE outperforms every other algorithm, so I was doubting the suggestion given to me.
– Tommaso Guerrini Feb 16 '17 at 16:06