Is there a relationship between model size and inference time?

Our intuition says that the larger the model size, the longer the inference time.

But in recent experiments I found that U2-Net† is only 4 MB, yet its inference time is 371,

while U-Net is 7 MB with an inference time of 58.

 

Both used a batch size of 12 on the same GPU, and no other programs were running on the server.

 

So is there really a relationship between the two?

Measuring inference time properly is actually fairly involved. Two points to watch out for in particular: asynchronous execution and GPU warm-up.

For the details, see this article: https://towardsdatascience.com/the-correct-way-to-measure-inference-time-of-deep-neural-networks-304a54e5187f
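For intuition, here is a minimal sketch of the naive measurement that these two points break (the vgg16 model and input shape are just placeholders, mirroring the snippet further down):

import time
import torch
import torchvision.models as models

model = models.vgg16().to("cuda").eval()
dummy_input = torch.randn(1, 3, 224, 224, device="cuda")

with torch.no_grad():
    start = time.time()
    _ = model(dummy_input)
    end = time.time()

# Misleading: CUDA kernels are launched asynchronously, so time.time() can
# return before the GPU has actually finished, and the very first pass also
# pays one-time CUDA initialization costs because there was no warm-up.
print(end - start)

The snippet below avoids both problems by using CUDA events, an explicit torch.cuda.synchronize(), and a few warm-up iterations.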

Here is a PyTorch snippet for measuring inference time:

import torch
import numpy as np
import torchvision.models as models

model = models.vgg16()
device = torch.device("cuda")
model.to(device)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224, dtype=torch.float).to(device)
# CUDA events record timestamps on the GPU stream itself, so they measure
# device time rather than how long the CPU takes to launch the kernels.
starter = torch.cuda.Event(enable_timing=True)
ender = torch.cuda.Event(enable_timing=True)
repetitions = 300
timings = np.zeros((repetitions, 1))

with torch.no_grad():
    # GPU warm-up: the first passes pay one-time CUDA initialization costs
    for _ in range(10):
        _ = model(dummy_input)
    # Measure performance
    for rep in range(repetitions):
        starter.record()
        _ = model(dummy_input)
        ender.record()
        # Wait for the GPU to finish before reading the elapsed time
        torch.cuda.synchronize()
        curr_time = starter.elapsed_time(ender)  # milliseconds
        timings[rep] = curr_time

mean_syn = np.sum(timings) / repetitions
std_syn = np.std(timings)
print(mean_syn)
print(std_syn)
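With a batch of 1 this gives per-image latency; to mirror the setup in the question you could instead feed a batch of 12 (e.g. dummy_input = torch.randn(12, 3, 224, 224).to(device)) and swap model for your own U-Net / U2-Net† instances, keeping everything else identical so the two numbers are comparable.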

 

Measure several times and take the average.

Generally speaking, a smaller model does mean faster inference, since there is much less computation to do, but it is not an absolute rule.
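If you want a number to put next to the measured latency, parameter count is what the on-disk model size mostly reflects; the helper below is a small illustrative sketch (not from the original answer):

import torch

def count_parameters(model: torch.nn.Module) -> int:
    # Trainable parameter count: roughly proportional to the checkpoint size
    # on disk, but NOT to inference latency. A network with few parameters
    # can still execute many sequential layers and kernel launches.
    # e.g. count_parameters(models.vgg16()) for the snippet above
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

In the U-Net vs. U2-Net† case specifically, one plausible explanation is that U2-Net†'s nested U-structure runs many more, largely sequential, layers per forward pass even though each layer is tiny, so the smaller checkpoint does not translate into lower latency.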