While deploying Tsinghua's ChatGLM-6B with Anaconda, my CUDA version and PyTorch version match, but I still get "CUDA Error: no kernel image is available for execution on the device". Is this a version mismatch after all? It's giving me a headache; any help from an expert would be greatly appreciated, thanks!
```
File "F:\ChatGLM-6B-main\web_demo.py", line 7, in <module>
  model = AutoModel.from_pretrained("F:\ChatGLM-6B-main\model", trust_remote_code=True).quantize(4).half().cuda()
File "C:\Users\Hasee/.cache\huggingface\modules\transformers_modules\model\modeling_chatglm.py", line 1434, in quantize
  self.transformer = quantize(self.transformer, bits, empty_init=empty_init, **kwargs)
File "C:\Users\Hasee/.cache\huggingface\modules\transformers_modules\model\quantization.py", line 157, in quantize
  layer.attention.query_key_value = QuantizedLinear(
File "C:\Users\Hasee/.cache\huggingface\modules\transformers_modules\model\quantization.py", line 137, in __init__
  self.weight = compress_int4_weight(self.weight)
File "C:\Users\Hasee/.cache\huggingface\modules\transformers_modules\model\quantization.py", line 78, in compress_int4_weight
  kernels.int4WeightCompression(
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\kernels\base.py", line 48, in __call__
  func = self._prepare_func()
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\kernels\base.py", line 40, in _prepare_func
  self._module.get_module(), self._func_name
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\kernels\base.py", line 24, in get_module
  self._module[curr_device] = cuda.cuModuleLoadData(self._code)
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\library\base.py", line 94, in wrapper
  return f(*args, **kwargs)
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\library\cuda.py", line 233, in cuModuleLoadData
  checkCUStatus(cuda.cuModuleLoadData(ctypes.byref(module), data))
File "F:\Users\Hasee\anaconda3\envs\NewWorld\lib\site-packages\cpm_kernels\library\cuda.py", line 216, in checkCUStatus
  raise RuntimeError("CUDA Error: %s" % cuGetErrorString(error))
RuntimeError: CUDA Error: no kernel image is available for execution on the device
```
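Before digging further, it may help to check which GPU architectures the installed PyTorch build actually supports: "no kernel image is available" typically means the card's compute capability is not among the architectures the CUDA kernels were compiled for (the cpm_kernels PTX has the same constraint). A quick diagnostic sketch:

```python
import torch

# "no kernel image is available for execution on the device" usually means
# the GPU's compute capability is missing from the set of architectures
# the installed CUDA kernels were compiled for.
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("supported archs:", torch.cuda.get_arch_list())  # e.g. ['sm_61', ..., 'sm_90']

if torch.cuda.is_available():
    # e.g. (6, 1) for a GTX 10-series card -> needs 'sm_61' in the list above
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
else:
    print("CUDA not available in this torch build")
```

If the tuple printed by `get_device_capability` has no matching `sm_XY` entry in `get_arch_list`, no version juggling will help; you need a torch build (or card) whose architectures match.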
Isn't there a one-click bundle? Just use that directly. Setting up the environment yourself really is a headache...
I ran into the same problem, exactly this error, and the traceback also ends at the checkCUStatus call in cuda.py. My feeling is that CUDA isn't set up correctly.
I noticed your GPU doesn't have enough VRAM: with only 2 GB you can't run inference on CUDA.
I've now got CPU mode running and can hold a conversation, but it's very slow.
My environment: Windows 10 64-bit, CUDA 11.8, torch 2.0.1, Python 3.8.7.
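For reference, a minimal sketch of CPU-mode loading as described in the ChatGLM-6B repo: replace `.quantize(4).half().cuda()` with `.float()`. The checkpoint path below is just this thread's example path; adjust it for your machine.

```python
from transformers import AutoModel, AutoTokenizer


def load_chatglm_cpu(model_path: str):
    """Load ChatGLM-6B for CPU inference.

    CPU mode skips the cpm_kernels CUDA path entirely, so the
    "no kernel image" error cannot occur; it is just very slow.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # .float() keeps full precision on CPU instead of .quantize(4).half().cuda()
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True).float().eval()
    return tokenizer, model


# Usage (path is an example from this thread; point it at your checkpoint):
# tokenizer, model = load_chatglm_cpu(r"F:\ChatGLM-6B-main\model")
# response, history = model.chat(tokenizer, "你好", history=[])
```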