CUDA C: problem when transferring image data

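I want to use CUDA to implement the functionality of OpenCV's mul function, but I ran into a problem when transferring the image data. As a test, I copy the image to the device and straight back again without doing any processing, and the result I get is shown in the image below: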

[image: result of the round-trip copy]

I don't know what is causing this. The code is as follows:

Mat Dianchen(Mat img1, Mat img2, Mat img3)
{
    // Buffer size in bytes: one uchar per pixel
    size_t memSize = img1.cols * img1.rows * sizeof(uchar);
    uchar* d_src1;
    uchar* d_src2;
    uchar* dst;

    int w = img1.cols;
    int h = img1.rows;
    // Launch configuration for the kernel (no kernel is launched in this test)
    dim3 blockSize(1024);
    dim3 gridSize((w * h + blockSize.x - 1) / blockSize.x);

    // Allocate the device buffers
    cudaMalloc((void**)&d_src1, memSize);
    cudaMalloc((void**)&d_src2, memSize);
    cudaMalloc((void**)&dst, memSize);

    // Copy both input images from host to device
    cudaMemcpy(d_src1, img1.data, memSize, cudaMemcpyHostToDevice);
    cudaMemcpy(d_src2, img2.data, memSize, cudaMemcpyHostToDevice);

    // Copy the second image straight back to the host, unmodified
    cudaMemcpy(img3.data, d_src2, memSize, cudaMemcpyDeviceToHost);

    // Release the device buffers
    cudaFree(d_src1);
    cudaFree(d_src2);
    cudaFree(dst);

    return img3;
}
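
One thing that may be worth checking, although the post does not say how the images were loaded: memSize counts one byte per pixel. If the images have 3 channels (an assumption here), only the first third of the pixel buffer makes the round trip. The sketch below is host-only C++ with no CUDA or OpenCV: the Image struct and the helper functions are hypothetical stand-ins, and std::memcpy plays the role of the two cudaMemcpy calls. It is meant only to illustrate the size arithmetic, not to diagnose the actual output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the cv::Mat fields that matter here.
struct Image {
    int rows;
    int cols;
    int channels;                     // 1 for grayscale, 3 for BGR
    std::vector<unsigned char> data;  // rows * cols * channels bytes, no row padding

    Image(int r, int c, int ch, unsigned char fill)
        : rows(r), cols(c), channels(ch),
          data(static_cast<std::size_t>(r) * c * ch, fill) {}
};

// Size as computed in the question: one byte per pixel.
std::size_t bytesIgnoringChannels(const Image& img) {
    return static_cast<std::size_t>(img.cols) * img.rows * sizeof(unsigned char);
}

// Size of the whole pixel buffer: one byte per channel per pixel.
std::size_t bytesWithChannels(const Image& img) {
    return static_cast<std::size_t>(img.cols) * img.rows * img.channels *
           sizeof(unsigned char);
}

// Host-only imitation of the round trip: src -> staging buffer -> dst.
// Returns how many bytes of dst equal the corresponding bytes of src afterwards.
std::size_t roundTrip(const Image& src, Image& dst, std::size_t bytes) {
    assert(src.data.size() == dst.data.size());
    assert(bytes <= src.data.size());

    std::vector<unsigned char> staging(bytes);            // plays the role of d_src2
    std::memcpy(staging.data(), src.data.data(), bytes);  // "host to device"
    std::memcpy(dst.data.data(), staging.data(), bytes);  // "device to host"

    std::size_t matching = 0;
    for (std::size_t i = 0; i < src.data.size(); ++i) {
        if (dst.data[i] == src.data[i]) ++matching;
    }
    return matching;
}
```

For a 4 x 6 image with 3 channels, bytesIgnoringChannels gives 24 while the buffer holds 72 bytes, so a round trip with the smaller size leaves 48 bytes of the destination untouched. With a real cv::Mat, the corresponding size for a continuous matrix would be img.total() * img.elemSize(), and img.isContinuous() reports whether the rows are stored without padding.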