Why does train loss keep decreasing while val_loss rises?

Train loss keeps decreasing, while val_loss decreases at first and then keeps rising.
What could cause this?



   | Name                                                                                   | Type                            | Params
---------------------------------------------------------------------------------------------------------------------------------------
0  | model                                                                                  | AlbertForSequenceClassification | 16 M  
1  | model.albert                                                                           | AlbertModel                     | 16 M  
2  | model.albert.embeddings                                                                | AlbertEmbeddings                | 2 M   
3  | model.albert.embeddings.word_embeddings                                                | Embedding                       | 2 M   
4  | model.albert.embeddings.position_embeddings                                            | Embedding                       | 65 K  
5  | model.albert.embeddings.token_type_embeddings                                          | Embedding                       | 256   
6  | model.albert.embeddings.LayerNorm                                                      | LayerNorm                       | 256   
7  | model.albert.embeddings.dropout                                                        | Dropout                         | 0     
8  | model.albert.encoder                                                                   | AlbertTransformer               | 12 M  
9  | model.albert.encoder.embedding_hidden_mapping_in                                       | Linear                          | 132 K 
10 | model.albert.encoder.albert_layer_groups                                               | ModuleList                      | 12 M  
11 | model.albert.encoder.albert_layer_groups.0                                             | AlbertLayerGroup                | 12 M  
12 | model.albert.encoder.albert_layer_groups.0.albert_layers                               | ModuleList                      | 12 M  
13 | model.albert.encoder.albert_layer_groups.0.albert_layers.0                             | AlbertLayer                     | 12 M  
14 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.full_layer_layer_norm       | LayerNorm                       | 2 K   
15 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention                   | AlbertAttention                 | 4 M   
16 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.query             | Linear                          | 1 M   
17 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.key               | Linear                          | 1 M   
18 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.value             | Linear                          | 1 M   
19 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.attention_dropout | Dropout                         | 0     
20 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.output_dropout    | Dropout                         | 0     
21 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.dense             | Linear                          | 1 M   
22 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.attention.LayerNorm         | LayerNorm                       | 2 K   
23 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.ffn                         | Linear                          | 4 M   
24 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.ffn_output                  | Linear                          | 4 M   
25 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.activation                  | ReLU                            | 0     
26 | model.albert.encoder.albert_layer_groups.0.albert_layers.0.dropout                     | Dropout                         | 0     
27 | model.albert.pooler                                                                    | Linear                          | 1 M   
28 | model.albert.pooler_activation                                                         | Tanh                            | 0     
29 | model.dropout                                                                          | Dropout                         | 0     
30 | model.classifier                                                                       | Linear                          | 7 K   
31 | criterion                                                                              | BCEWithLogitsLoss               | 0     
Epoch 1: 100%|██████████| 320/320 [03:55<00:00,  1.36it/s, loss=0.405, v_num=9, val_loss=0.333]
D:\albert\albertcls\pytorch_lightning\utilities\distributed.py:23: UserWarning: Did not find hyperparameters at model hparams. Saving checkpoint without hyperparameters.
  warnings.warn(*args, **kwargs)
Epoch 2: 100%|██████████| 320/320 [04:01<00:00,  1.33it/s, loss=0.295, v_num=9, val_loss=0.252]
Epoch 3: 100%|██████████| 320/320 [04:03<00:00,  1.32it/s, loss=0.236, v_num=9, val_loss=0.215]
Epoch 4: 100%|██████████| 320/320 [04:04<00:00,  1.31it/s, loss=0.200, v_num=9, val_loss=0.197]
Epoch 5: 100%|██████████| 320/320 [04:05<00:00,  1.31it/s, loss=0.168, v_num=9, val_loss=0.19]
Epoch 6: 100%|██████████| 320/320 [04:05<00:00,  1.30it/s, loss=0.136, v_num=9, val_loss=0.195]
Epoch 7: 100%|██████████| 320/320 [04:05<00:00,  1.30it/s, loss=0.110, v_num=9, val_loss=0.205]
Epoch 8: 100%|██████████| 320/320 [04:05<00:00,  1.30it/s, loss=0.083, v_num=9, val_loss=0.213]
Epoch 9: 100%|██████████| 320/320 [04:06<00:00,  1.30it/s, loss=0.063, v_num=9, val_loss=0.217]
Epoch 10: 100%|██████████| 320/320 [04:06<00:00,  1.30it/s, loss=0.052, v_num=9, val_loss=0.236]
Epoch 11: 100%|██████████| 320/320 [04:07<00:00,  1.29it/s, loss=0.045, v_num=9, val_loss=0.232]



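The model is overfitting: it performs well on the training set but poorly on the test set (here, the validation set). In the log, train loss keeps falling from 0.405 to 0.045, while val_loss bottoms out at about 0.19 after epoch 5 and then climbs to 0.232 by epoch 11.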
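
One common response to this pattern is early stopping: keep the checkpoint with the lowest val_loss and stop once it has not improved for a few epochs. The sketch below is a minimal, framework-free illustration that uses the per-epoch val_loss values from the log above. The function name `early_stopping_epoch` and the `patience=3` setting are assumptions made for the example, not part of the original run.

```python
from typing import List, Tuple

# Per-epoch val_loss values copied from the training log above (epochs 1..11).
VAL_LOSS = [0.333, 0.252, 0.215, 0.197, 0.19, 0.195,
            0.205, 0.213, 0.217, 0.236, 0.232]


def early_stopping_epoch(val_losses: List[float], patience: int = 3,
                         min_delta: float = 0.0) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Training stops after `patience` consecutive epochs in which val_loss
    fails to beat the best value seen so far by more than `min_delta`.
    If that never happens, stop_epoch is the last epoch.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


best, stop = early_stopping_epoch(VAL_LOSS, patience=3)
print(f"best epoch: {best}, stop at epoch: {stop}")
```

With the values from this log, the sketch picks epoch 5 as the best checkpoint and stops at epoch 8. PyTorch Lightning ships an `EarlyStopping` callback built on the same idea; its exact arguments depend on the installed version, so check the documentation for that version before using it.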

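  • The "Dataset notes" section of the blog post "Problems encountered with train loss and val loss during training" may help with your question. You can read the excerpt below or go to the original post:
    • The training set is the collection of samples used to train the model; it determines the model's weights.
      • The more complex the model, the more training samples it needs.
      • The weights are fitted by backpropagation.
    • The validation set is used to evaluate the model, choose between models, and tune parameters.
      • Use it for model selection, hyperparameter tuning, and a preliminary evaluation.
    • The test set gives an unbiased estimate of the model's performance.
      • Evaluate on one more held-out set to check whether the earlier results were stable only by chance, i.e. to verify that the estimate is unbiased.
    • Keep the sets identically distributed; ideally the positive/negative ratio of the test set matches that of the real environment.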
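
The points above about keeping the splits identically distributed and preserving the positive/negative ratio can be sketched as a stratified split. This is a minimal pure-Python illustration that assumes one label per sample. The function name `stratified_split`, the 80/10/10 fractions, and the toy labels are assumptions for the example, not details from the original post.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_split(labels: Sequence[Hashable],
                     fractions: Tuple[float, float, float] = (0.8, 0.1, 0.1),
                     seed: int = 0) -> Dict[str, List[int]]:
    """Split sample indices into train/val/test class by class, so each
    split keeps roughly the same label ratio as the full data set."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)

    # Group sample indices by label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    splits: Dict[str, List[int]] = {"train": [], "val": [], "test": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * fractions[0])
        n_val = int(len(indices) * fractions[1])
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]  # remainder goes to test
    return splits


# Toy data (hypothetical): 20% positives.
labels = [1] * 200 + [0] * 800
splits = stratified_split(labels)
for name, idxs in splits.items():
    pos_rate = sum(labels[i] for i in idxs) / len(idxs)
    print(name, len(idxs), round(pos_rate, 3))
```

Splitting per class rather than shuffling everything together is what keeps the positive rate close to the overall rate in every split, even for a small validation or test set.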

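    If the model performs well on the training, validation, and test sets but poorly on new data in real use, a possible cause is:

    Distribution mismatch: the features of the new data differ from those of the original data, and the network cannot extract the new data's features well.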
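
One rough way to check for this kind of mismatch is to compare the distribution of a simple scalar feature (for text, for instance, the input length in tokens) between the original data and the new data. The sketch below computes the two-sample Kolmogorov-Smirnov statistic in pure Python. The function name `ks_statistic`, the choice of feature, and the toy numbers are all assumptions for illustration; a large value only hints at a shift and does not prove one.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of `a` and `b` (0 = identical, 1 = fully separated)."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap between two step functions occurs at a sample point.
    for x in set(sa) | set(sb):
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical input lengths for the original data and for new data.
train_lengths = [10, 12, 15, 20, 22, 25, 30, 31]
new_lengths = [40, 45, 50, 60]
print(ks_statistic(train_lengths, train_lengths))  # same sample on both sides
print(ks_statistic(train_lengths, new_lengths))    # non-overlapping samples
```

A value near 0 means the two samples look alike on that feature; a value near 1 means they barely overlap, which would be a reason to collect training data that better resembles the new inputs.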