Here is the dataset. I tried converting this implementation into its Keras analog. Why are my predictions so bad? They are almost always close to a single number, no matter whether I use more layers, more neurons, or even a convolutional NN. Am I doing something obviously wrong? I used the same dataset to train the network in the second link and got good results.
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

data1_file_path = 'GSE106648_data1.csv'
data2_file_path = 'GSE106648_data2.csv'

# read in training data (skip the header row)
train = np.loadtxt(data1_file_path, skiprows=1, delimiter=',')
print("Finished reading training set")

# read in test data
test = np.loadtxt(data2_file_path, skiprows=1, delimiter=',')
print("Finished reading test set")
#separate training/testing input features and labels
x_train = train[:,1:]
y_train = train[:,0].reshape(-1,1)
x_test = test[:,1:]
y_test = test[:,0].reshape(-1,1)
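For context, here is a quick sanity check on the arrays after the split (not part of the original script; it just prints shapes and the label range):

# sanity check: confirm the shapes line up and the labels look sensible
print("x_train:", x_train.shape, "y_train:", y_train.shape)
print("x_test:", x_test.shape, "y_test:", y_test.shape)
print("label range:", y_train.min(), "to", y_train.max())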
# define base model
def baseline_model():
    # create model
    model = Sequential()
    # LeakyReLU is a layer rather than a built-in string activation, so pass an instance
    model.add(Dense(200, input_shape=(x_train.shape[1],), activation=tf.keras.layers.LeakyReLU()))
    model.add(Dense(1))
    # compile model
    model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
    return model
baseline_model().fit(x_train, y_train, batch_size=20, epochs=500)
y_pred = baseline_model().predict(x_test)
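For reference, the "more layers, more neurons" variants I mention above looked roughly like this (a sketch; the exact layer widths varied between runs):

# one of the deeper variants I tried; same optimizer and loss as the baseline
def deeper_model():
    model = Sequential()
    model.add(Dense(512, input_shape=(x_train.shape[1],), activation=tf.keras.layers.LeakyReLU()))
    model.add(Dense(256, activation=tf.keras.layers.LeakyReLU()))
    model.add(Dense(64, activation=tf.keras.layers.LeakyReLU()))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
    return model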
These are the predictions:
[image: predicted values on the test set]
And the loss function vs. epochs:
[image: training loss curve over 500 epochs]
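In case it matters, the prediction plot was produced roughly like this (assuming matplotlib; the exact styling is not important):

import matplotlib.pyplot as plt

# scatter of predicted vs. true labels on the test set
plt.scatter(y_test, y_pred)
plt.xlabel("true label")
plt.ylabel("predicted label")
plt.show()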

