I'm a graduate student working in an interdisciplinary field and a beginner at machine learning. My problem: I'm using a neural network to model the relationship between 6 inputs and 1 output, with 3000 samples in the training set. No matter how I tune the hyperparameters, the fit I get in Keras is consistently worse than what I get from scikit-learn.
Below are the network architectures, parameters, and training setups for both Keras and scikit-learn. What directions could I try to improve the Keras network?
# Imports assumed by the snippet (not shown in my original code)
from tensorflow import keras
from tensorflow.keras import layers, initializers

model_1 = keras.Sequential(
    [
        layers.Dense(20,
                     activation="relu",
                     input_shape=(6,),
                     name="layer1",
                     kernel_initializer=initializers.RandomNormal(stddev=0.01),
                     bias_initializer=initializers.Zeros(),
                     ),
        layers.Dropout(0.3),
        layers.Dense(20,
                     activation="relu",
                     name="layer2",
                     kernel_initializer=initializers.RandomNormal(stddev=0.1),
                     bias_initializer=initializers.Zeros(),
                     ),
        layers.Dropout(0.3),
        layers.Dense(20,
                     activation="relu",
                     name="layer3",
                     kernel_initializer=initializers.RandomNormal(stddev=0.1),
                     bias_initializer=initializers.Zeros(),
                     ),
        layers.Dropout(0.3),
        layers.Dense(20,
                     activation="relu",
                     name="layer4",
                     kernel_initializer=initializers.RandomNormal(stddev=0.1),
                     bias_initializer=initializers.Zeros(),
                     # kernel_regularizer=keras.regularizers.l2(0.0001),
                     ),
        layers.Dropout(0.3),
        layers.Dense(20,
                     activation="relu",
                     name="layer5",
                     kernel_initializer=initializers.RandomNormal(stddev=0.01),
                     bias_initializer=initializers.Zeros(),
                     ),
        layers.Dense(1,
                     name="lastlayer",
                     kernel_initializer=initializers.RandomNormal(stddev=0.01),
                     bias_initializer=initializers.Zeros(),
                     ),
    ]
)
model_1.compile(
    # learning_rate replaces the deprecated lr argument
    optimizer=keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                                    epsilon=1e-8, clipvalue=1),
    loss="mse",
)
history_1 = model_1.fit(x_scaled_2, y_scaled_2,
                        batch_size=64,
                        validation_split=0.2,
                        epochs=4000,
                        verbose=2,
                        )
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(activation='relu', alpha=0.0001, batch_size='auto',
                     beta_1=0.9, beta_2=0.999,
                     early_stopping=False, epsilon=1e-08,
                     hidden_layer_sizes=(20, 60, 60, 60, 60, 20),
                     learning_rate='constant', learning_rate_init=0.01,
                     max_iter=50000, momentum=0.9, n_iter_no_change=100,
                     nesterovs_momentum=True, power_t=0.5,
                     random_state=1, shuffle=True, solver='lbfgs',
                     tol=0.0001, validation_fraction=0.2,
                     verbose=True, warm_start=False)
1. You used dropout, so make sure the model is in test mode at inference time, e.g. with keras.backend.set_learning_phase(0) (note that model.predict already disables dropout automatically).
2. The Keras model and scikit-learn's MLPRegressor use different initial learning rates (0.001 vs. 0.01); make them the same before comparing.
3. In Keras you can use LearningRateScheduler to adjust the learning rate dynamically, decaying it as the epochs increase; this often buys extra improvement once training hits a plateau. Example: https://tensorflow.google.cn/api_docs/python/tf/keras/callbacks/LearningRateScheduler?hl=en#example
4. For kernel_initializer, the default glorot_uniform is fine.
5. For the MLPRegressor you chose L-BFGS, a second-order optimization method; despite its usage restrictions, second-order methods generally converge faster. If you're using the Keras bundled with TF2, you could try the tfp.optimizer.lbfgs_minimize optimizer from TensorFlow Probability, or switch the MLPRegressor solver to adam and compare again.
6. The two networks are different sizes; make the MLPRegressor's hidden_layer_sizes match the TF model before comparing.
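The LearningRateScheduler idea in point 3 can be sketched as follows; this is a minimal example adapted from the linked Keras docs, and the 10-epoch threshold and exp(-0.1) decay factor are illustrative placeholders, not tuned values:

```python
import numpy as np
from tensorflow import keras

# Keep the initial rate for the first 10 epochs, then decay it exponentially.
def scheduler(epoch, lr):
    if epoch < 10:
        return lr
    return lr * float(np.exp(-0.1))

# Tiny stand-in model with the same 6-feature input as the question.
model = keras.Sequential([keras.Input(shape=(6,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01), loss="mse")

x = np.random.rand(64, 6)
y = np.random.rand(64, 1)
model.fit(x, y, epochs=15, verbose=0,
          callbacks=[keras.callbacks.LearningRateScheduler(scheduler)])

# After 5 decay steps: 0.01 * exp(-0.5) ~ 0.006
final_lr = float(model.optimizer.learning_rate.numpy())
print(final_lr)
```

The callback calls scheduler at the start of every epoch with the current rate, so the decay compounds; replace the toy model and data with your real model_1 and x_scaled_2/y_scaled_2.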
You only have 3000 samples — does the network really need to be this deep? Also, is the performance you reported measured on the test set or the training set? Compare the two separately.
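The train-vs-test comparison asked for above can be sketched like this; synthetic data stands in for the real 3000-by-6 dataset, and the hidden layer sizes and max_iter here are illustrative only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 3000 samples, 6 inputs, 1 output.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=3000)

# Hold out a test set so the two scores are measured on disjoint data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Fit the scaler on the training split only, to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
model = MLPRegressor(hidden_layer_sizes=(20, 20), solver='lbfgs',
                     max_iter=2000, random_state=1)
model.fit(scaler.transform(X_train), y_train)

# A large gap between these two numbers indicates overfitting.
train_r2 = model.score(scaler.transform(X_train), y_train)
test_r2 = model.score(scaler.transform(X_test), y_test)
print(f"train R2: {train_r2:.3f}, test R2: {test_r2:.3f}")
```

The same split-and-score pattern applies to the Keras model (via model.evaluate on each split), so both frameworks can be compared on identical held-out data.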