Train loss keeps decreasing, but val_loss decreases at first and then keeps rising!
What could be causing this?
| Name | Type | Params
---------------------------------------------------------------------------------------------------------------------------------------
0 | model | AlbertForSequenceClassification | 16 M
1 | model.albert | AlbertModel | 16 M
2 | model.albert.embeddings | AlbertEmbeddings | 2 M
3 | model.albert.embeddings.word_embeddings | Embedding | 2 M
4 | model.albert.embeddings.position_embeddings | Embedding | 65 K
5 | model.albert.embeddings.token_type_embeddings | Embedding | 256
6 | model.albert.embeddings.LayerNorm | LayerNorm | 256
7 | model.albert.embeddings.dropout | Dropout | 0
8 | model.albert.encoder | AlbertTransformer | 12 M
9 | model.albert.encoder.embedding_hidden_mapping_in | Linear | 132 K
10 | model.albert.encoder.albert_layer_groups | ModuleList | 12 M
11 | model.albert.encoder.albert_layer_groups.0 | AlbertLayerGroup | 12 M
12 | model.albert.encoder.albert_layer_groups.0.albert_layers | ModuleList | 12 M
13 | model.albert.encoder.albert_layer_groups.0.albert_layers.0 | AlbertLayer | 12 M
14 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.full_layer_layer_norm | LayerNorm | 2 K
15 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention | AlbertAttention | 4 M
16 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.query | Linear | 1 M
17 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.key | Linear | 1 M
18 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.value | Linear | 1 M
19 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.attention_dropout | Dropout | 0
20 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.output_dropout | Dropout | 0
21 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.dense | Linear | 1 M
22 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.LayerNorm | LayerNorm | 2 K
23 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.ffn | Linear | 4 M
24 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.ffn_output | Linear | 4 M
25 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.activation | ReLU | 0
26 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.dropout | Dropout | 0
27 | model.albert.pooler | Linear | 1 M
28 | model.albert.pooler_activation | Tanh | 0
29 | model.dropout | Dropout | 0
30 | model.classifier | Linear | 7 K
31 | criterion | BCEWithLogitsLoss | 0
Epoch 1: 100%|██████████| 320/320 [03:55<00:00, 1.36it/s, loss=0.405, v_num=9, val_loss=0.333]
D:\albert\albertcls\pytorch_lightning\utilities\distributed.py:23: UserWarning: Did not find hyperparameters at model hparams. Saving checkpoint without hyperparameters.
warnings.warn(*args, **kwargs)
Epoch 2: 100%|██████████| 320/320 [04:01<00:00, 1.33it/s, loss=0.295, v_num=9, val_loss=0.252]
Epoch 3: 100%|██████████| 320/320 [04:03<00:00, 1.32it/s, loss=0.236, v_num=9, val_loss=0.215]
Epoch 4: 100%|██████████| 320/320 [04:04<00:00, 1.31it/s, loss=0.200, v_num=9, val_loss=0.197]
Epoch 5: 100%|██████████| 320/320 [04:05<00:00, 1.31it/s, loss=0.168, v_num=9, val_loss=0.19]
Epoch 6: 100%|██████████| 320/320 [04:05<00:00, 1.30it/s, loss=0.136, v_num=9, val_loss=0.195]
Epoch 7: 100%|██████████| 320/320 [04:05<00:00, 1.30it/s, loss=0.110, v_num=9, val_loss=0.205]
Epoch 8: 100%|██████████| 320/320 [04:05<00:00, 1.30it/s, loss=0.083, v_num=9, val_loss=0.213]
Epoch 9: 100%|██████████| 320/320 [04:06<00:00, 1.30it/s, loss=0.063, v_num=9, val_loss=0.217]
Epoch 10: 100%|██████████| 320/320 [04:06<00:00, 1.30it/s, loss=0.052, v_num=9, val_loss=0.236]
Epoch 11: 100%|██████████| 320/320 [04:07<00:00, 1.29it/s, loss=0.045, v_num=9, val_loss=0.232]
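The model is overfitting: it fits the training set well but does not generalize to held-out data (here, the validation set). In the log, the training loss keeps falling from 0.405 to 0.045, while val_loss bottoms out at 0.19 after epoch 5 and then climbs to 0.232 by epoch 11.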
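One common response to this pattern is early stopping: keep the weights from the epoch with the lowest val_loss and stop once it has not improved for a few epochs. Below is a minimal pure-Python sketch of that rule applied to the val_loss values from the log above. The function name, `patience`, and `min_delta` are illustrative choices, not taken from the original training script. PyTorch Lightning also provides an `EarlyStopping` callback for this purpose, which is probably the more practical option inside a Lightning trainer.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Apply a patience-based early stopping rule to per-epoch validation losses.

    Returns (best_epoch, stop_epoch) with 1-based epoch numbers.
    stop_epoch is None if the rule never triggers.
    """
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, None


# End-of-epoch val_loss values for epochs 1..11, copied from the log above.
val_losses = [0.333, 0.252, 0.215, 0.197, 0.19, 0.195,
              0.205, 0.213, 0.217, 0.236, 0.232]

best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(best_epoch, stop_epoch)  # 5 8
```

With a patience of 3, this rule would keep the epoch 5 checkpoint and stop after epoch 8. Other usual remedies (more training data, stronger dropout or weight decay, a smaller learning rate) address the same symptom but are not shown here.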
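A related case: if the model does well on the training, validation, and test sets but performs poorly on new data in real use, a likely cause is the following.
Distribution mismatch: the features of the new data differ from those of the original data, so the network extracts features from the new data poorly.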
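A rough way to check for such a mismatch is to compare the distribution of a simple scalar feature (for text, something like input length) between the original data and the new data. The sketch below computes the two-sample Kolmogorov-Smirnov statistic by hand, which is the largest gap between the two empirical CDFs: 0 means identical samples, 1 means no overlap. The length values are made up for illustration and do not come from the original dataset. If SciPy is available, `scipy.stats.ks_2samp` computes the same statistic along with a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical input lengths (in tokens); replace with real measurements.
train_lengths = [12, 15, 18, 20, 22, 25, 27, 30]
new_lengths = [40, 45, 52, 58, 60, 66, 71, 80]

same = ks_statistic(train_lengths, train_lengths)
shifted = ks_statistic(train_lengths, new_lengths)
print(same, shifted)  # 0.0 1.0
```

A large statistic on a feature like this only suggests that the inputs have shifted. It does not show which shift hurts the model, so it is best treated as a first diagnostic before collecting and labeling some of the new data.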