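I've recently been training a model with YOLOv5. With the same code and the same data, training works fine on my company's computer, but on my own computer the R, P, and mAP values are all very low, with no errors or warnings at all. A fix I found early on said the CUDA version was too high, so I reinstalled CUDA (11.6 ---> 10.2) and the matching torch (torch1.10.0+cu102).
The problem is still not solved.
Configuration printed at the start of training: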
Namespace(adam=False, artifact_alias='latest', batch_size=2, bbox_interval=-1, bucket='', cache_images=False, cfg='', data='data/coco.yaml', device='', entity=None, epochs=20, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.p5.yaml', image_weights=False, img_size=[640, 640], label_smoothing=0.0, linear_lr=False, local_rank=-1, multi_scale=False, name='exp', noautoanchor=False, nosave=False, notest=False, project='runs/train', quad=False, rect=False, resume=False, save_dir='runs\train\exp35', save_period=-1, single_cls=False, sync_bn=False, total_batch_size=2, upload_dataset=False, weights='yolov7.pt', workers=8, world_size=1)
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.3, cls_pw=1.0, obj=0.7, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.2, scale=0.9, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.15, copy_paste=0.0, paste_in=0.15
wandb: Install Weights & Biases for YOLOR logging with 'pip install wandb' (recommended)
Overriding model.yaml nc=80 with nc=2
Results from 10 epochs of training:
Epoch gpu_mem box obj cls total labels img_size
0/19 2.49G 0.05646 0.01438 0.009927 0.08077 1 640: 100%|| 122/122 [01:37<00:00, 1.26it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:05<00:00, 1.59it/s]
all 34 0 0 0 0 0
Epoch gpu_mem box obj cls total labels img_size
1/19 2.55G 0.05786 0.01027 0.007407 0.07554 12 640: 100%|| 122/122 [01:48<00:00, 1.13it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.62it/s]
all 34 129 0.000391 0.0155 1e-05 1.29e-06
Epoch gpu_mem box obj cls total labels img_size
2/19 2.55G 0.04771 0.008596 0.003802 0.06011 21 640: 100%| 122/122 [01:45<00:00, 1.15it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|9/9 [00:01<00:00, 5.77it/s]
all 34 129 0.00845 0.00775 0.000287 8.58e-05
Epoch gpu_mem box obj cls total labels img_size
3/19 2.55G 0.04877 0.007403 0.003021 0.05919 10 640: 100%|| 122/122 [01:46<00:00, 1.15it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.89it/s]
all 34 129 0.00987 0.0155 0.000351 4.51e-05
Epoch gpu_mem box obj cls total labels img_size
4/19 2.55G 0.04114 0.006993 0.001971 0.0501 3 640: 100%|| 122/122 [01:54<00:00, 1.07it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 9/9 [00:01<00:00, 6.03it/s]
all 34 129 0.0102 0.0388 0.000561 6.6e-05
Epoch gpu_mem box obj cls total labels img_size
5/19 2.55G 0.03607 0.005895 0.001194 0.04316 0 640: 100%|| 122/122 [01:54<00:00, 1.07it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.94it/s]
all 34 129 0.000398 0.031 2.08e-05 4.07e-06
Epoch gpu_mem box obj cls total labels img_size
6/19 2.55G 0.03495 0.005608 0.0009469 0.0415 7 640: 100%|| 122/122 [02:01<00:00, 1.00it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.82it/s]
all 34 129 0.0636 0.0155 0.00309 0.000399
Epoch gpu_mem box obj cls total labels img_size
7/19 2.55G 0.03832 0.00456 0.00065 0.04353 7 640: 100%|| 122/122 [01:44<00:00, 1.17it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 9/9 [00:01<00:00, 5.89it/s]
all 34 129 0.00726 0.0155 0.000846 0.000115
Epoch gpu_mem box obj cls total labels img_size
8/19 2.55G 0.03626 0.005036 0.000683 0.04198 5 640: 100% 122/122 [01:52<00:00, 1.08it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.89it/s]
all 34 129 0.00742 0.0853 0.0015 0.000308
Epoch gpu_mem box obj cls total labels img_size
9/19 2.55G 0.03495 0.003713 0.0003132 0.03898 15 640: 100%|| 122/122 [01:53<00:00, 1.08it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|| 9/9 [00:01<00:00, 5.93it/s]
all 34 129 0.0107 0.178 0.00456 0.000785
Epoch gpu_mem box obj cls total labels img_size
10/19 2.55G 0.03505 0.00322 0.0002998 0.03857 5 640: 100% 122/122 [01:50<00:00, 1.11it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 9/9 [00:01<00:00, 5.89it/s]
all 34 129 0.012 0.233 0.00505 0.000907
I've been fighting with this for a long time and it's always like this /(ㄒoㄒ)/~~
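Are you sure your environment is the same as the one at your company? Or try training on the CPU and see. If CPU training works, then the problem is on the CUDA side. This kind of issue is hard to pin down in one go; you can only rule out causes one by one. It could be that the torchvision version matching your torch wasn't installed correctly, among other possibilities. The best approach is to make your environment identical to the company's, so you can tell where the problem comes from.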
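One way to make that comparison concrete is to print the version information on both machines and compare the two outputs line by line. The sketch below is only an illustration and not something posted in this thread: `describe_environment` is a made-up helper name, the package list is an assumption, and a missing package is reported as "not installed" instead of raising an error.

```python
import importlib
import platform


def describe_environment(packages=("torch", "torchvision", "numpy")):
    """Collect version info that can be compared across two machines."""
    report = {"python": platform.python_version(), "os": platform.platform()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except (ImportError, OSError):
            # OSError covers broken native libraries at import time.
            report[name] = "not installed"
            continue
        report[name] = getattr(module, "__version__", "unknown")

    # CUDA details are only reachable when torch itself imports cleanly.
    try:
        import torch
    except (ImportError, OSError):
        return report
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
        report["cudnn"] = torch.backends.cudnn.version()
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it on the company machine and on the home machine should show whether torch, torchvision, or the CUDA build differ between the two.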
How many classes did you set up?
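The training log above prints "Overriding model.yaml nc=80 with nc=2", so with two classes every class id in the label files should be 0 or 1. A rough way to check that is sketched below, assuming the labels are in the usual YOLO text format (one `class x_center y_center width height` row per object). The function names and the directory path in the usage comment are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_label_classes(label_dir):
    """Count class ids across YOLO-format .txt label files in label_dir.

    Returns (counts, empty_files): a Counter of class id -> number of boxes,
    and the number of label files that contain no boxes at all.
    """
    counts = Counter()
    empty_files = 0
    for path in sorted(Path(label_dir).glob("*.txt")):
        rows = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if not rows:
            empty_files += 1
        for row in rows:
            # The first token on each row is the class id.
            counts[int(float(row.split()[0]))] += 1
    return counts, empty_files


def find_out_of_range(counts, nc):
    """Return the class ids that fall outside the valid range 0..nc-1."""
    return sorted(c for c in counts if c < 0 or c >= nc)


# Example usage (the path is a placeholder, adjust it to your dataset):
# counts, empty_files = count_label_classes("path/to/labels/train")
# print(counts, empty_files, find_out_of_range(counts, nc=2))
```

If `find_out_of_range` returns anything, or most files turn out to be empty, the dataset on that machine is worth checking before looking further at CUDA.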
Did you ever solve this? I'm in the same situation.
Same here. Accuracy on coco128 is extremely low, and my CUDA is also 10.2, installed to match my graphics card.
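This kind of situation usually comes down to versions being too new. Both the pytorch and cuda versions may be too high, so I'd suggest downgrading further. You could try (pytorch1.9.1+cuda10.2).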
Has anyone solved this problem? I've run into the same situation too.