PyTorch model evaluation fails with two errors: ForkingPickler(file, protocol).dump(obj) MemoryError and EOFError: Ran out of input

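When I run valid.py, about 50 GB of space on the C: drive is consumed very quickly and then released again, after which the script fails.
The full error output is below.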


Traceback (most recent call last):
  File "run/pose2d/valid.py", line 159, in <module>
    main()
  File "run/pose2d/valid.py", line 154, in main
    validate(config, valid_loader, valid_dataset, model, criterion,
  File "D:\TransFusion-Pose\run\pose2d\..\..\lib\core\function.py", line 233, in validate
    for i, (input, target, weight, meta) in enumerate(loader):
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
    w.start()
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
MemoryError
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\procedure_for_study\Anaconda3\envs\transpose\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

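My guess is that this is a threading or memory problem, but I don't know how to fix it, or whether my understanding is even correct. Could someone please take a look? Thank you very much.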

Set num_workers=0. Remember to change it everywhere it is set: both in the argument settings and in the dataset loading and processing code.
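
As background on why this helps: the traceback shows the worker being started through popen_spawn_win32.py, where reduction.dump(process_obj, to_child) pickles the process object, dataset included, once per worker. If the dataset keeps a lot of data in memory, that pickling step can exhaust memory in the parent, and the half-started child then most likely fails with EOFError because its input is cut off. The sketch below only estimates how large such a pickle would be. InMemoryDataset and pickled_size_mb are hypothetical names used for illustration, not part of TransFusion-Pose or PyTorch.

```python
import pickle


class InMemoryDataset:
    """Hypothetical stand-in for a dataset that preloads every sample in __init__."""

    def __init__(self, num_samples, sample_bytes):
        # All samples live in RAM, so all of them end up in the pickle.
        self.samples = [bytes(sample_bytes) for _ in range(num_samples)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]


def pickled_size_mb(obj):
    """Size in MB of the byte stream needed to send obj to one spawned worker."""
    return len(pickle.dumps(obj)) / 1e6


# 100 samples of 100,000 bytes each: roughly 10 MB of payload.
dataset = InMemoryDataset(num_samples=100, sample_bytes=100_000)
size_mb = pickled_size_mb(dataset)
print(f"pickled dataset: {size_mb:.1f} MB per worker")
```

Calling pickled_size_mb on the real validation dataset object (assuming it can be pickled at all) gives a rough idea of what each worker start has to transfer. With num_workers=0 no worker process is spawned, so this step is skipped entirely.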

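# num_workers=0 loads every batch in the main process, so no worker
# process is spawned and nothing has to be pickled for it.
loader = torch.utils.data.DataLoader(dataset=dataset,
                                     batch_size=self.batch,
                                     shuffle=self.shuf,
                                     num_workers=0,
                                     drop_last=True)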
 
parser.add_argument('--num_workers', type=int, default=0)
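
One way to avoid having to change the value in several places is to read it once from the command line and reuse it for every loader. This is only a minimal sketch under assumptions: build_loader_kwargs and the --batch_size option are hypothetical, the shuffle value is a placeholder, and the real scripts may organise their configuration differently.

```python
import argparse


def build_loader_kwargs(argv=None):
    """Parse the command line and return keyword arguments for a DataLoader."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int, default=8)   # hypothetical option
    parser.add_argument('--num_workers', type=int, default=0)  # 0 = load in the main process
    args = parser.parse_args(argv)
    return {
        'batch_size': args.batch_size,
        'shuffle': False,                 # placeholder value for this sketch
        'num_workers': args.num_workers,  # single source of truth for all loaders
        'drop_last': True,
    }


loader_kwargs = build_loader_kwargs([])
print(loader_kwargs)
# With PyTorch installed, the dictionary would be passed on unchanged:
# loader = torch.utils.data.DataLoader(dataset, **loader_kwargs)
```

Because every loader receives the same dictionary, passing --num_workers 0 (or relying on the default) is enough, and no hard-coded value is left behind in the dataset code.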