Is there a relationship between model size and inference time?

Our intuition says that the larger the model size, the longer the inference time.

But in recent experiments I found that U2-Net† is only 4 MB, yet its inference time is 371,

while U-Net is 7 MB with an inference time of 58.

 

Both used a batch size of 12 on the same GPU, and no other programs were running on the server.

 

So is there really a relationship between the two?

Measuring inference time properly is actually fairly involved. Two points to watch out for in particular: asynchronous execution and GPU warm-up.

For the details, see this article: https://towardsdatascience.com/the-correct-way-to-measure-inference-time-of-deep-neural-networks-304a54e5187f
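For intuition, here is a minimal sketch of the naive measurement that these two points break (the vgg16 model and input shape are just placeholders, mirroring the snippet further down):

import time
import torch
import torchvision.models as models

model = models.vgg16().to("cuda").eval()
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")

with torch.no_grad():
    start = time.time()
    _ = model(dummy_input)
    end = time.time()

# Misleading: CUDA kernels are launched asynchronously, so time.time() can
# return before the GPU has actually finished, and the very first pass also
# pays one-time CUDA initialization costs because there was no warm-up.
print(end - start)

The snippet below avoids both problems by using CUDA events, an explicit torch.cuda.synchronize(), and a few warm-up iterations.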

Here is a PyTorch snippet for measuring inference time:

import torch
import numpy as np
import torchvision.models as models

model = models.vgg16()
device = torch.device("cuda")
model.to(device)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224, dtype=torch.float).to(device)
# CUDA events record timestamps on the GPU stream itself, so they measure
# device time rather than how long the CPU takes to launch the kernels.
starter = torch.cuda.Event(enable_timing=True)
ender = torch.cuda.Event(enable_timing=True)
repetitions = 300
timings = np.zeros((repetitions, 1))

with torch.no_grad():
    # GPU warm-up: the first passes pay one-time CUDA initialization costs
    for _ in range(10):
        _ = model(dummy_input)
    # Measure performance
    for rep in range(repetitions):
        starter.record()
        _ = model(dummy_input)
        ender.record()
        # Wait for the GPU to finish before reading the elapsed time
        torch.cuda.synchronize()
        curr_time = starter.elapsed_time(ender)  # milliseconds
        timings[rep] = curr_time

mean_syn = np.sum(timings) / repetitions
std_syn = np.std(timings)
print(mean_syn)
print(std_syn)
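With a batch of 1 this gives per-image latency; to mirror the setup in the question you could instead feed a batch of 12 (e.g. dummy_input = torch.randn(12, 3, 224, 224).to(device)) and swap model for your own U-Net / U2-Net† instances, keeping everything else identical so the two numbers are comparable.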

 

Measure several times and take the average.

Generally speaking, a smaller model does mean faster inference, since there is much less computation to do, but it is not an absolute rule.
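If you want a number to put next to the measured latency, parameter count is what the on-disk model size mostly reflects; the helper below is a small illustrative sketch (not from the original answer):

import torch

def count_parameters(model: torch.nn.Module) -> int:
    # Trainable parameter count: roughly proportional to the checkpoint size
    # on disk, but NOT to inference latency. A network with few parameters
    # can still execute many sequential layers and kernel launches.
    # e.g. count_parameters(models.vgg16()) for the snippet above
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

In the U-Net vs. U2-Net† case specifically, one plausible explanation is that U2-Net†'s nested U-structure runs many more, largely sequential, layers per forward pass even though each layer is tiny, so the smaller checkpoint does not translate into lower latency.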