How to use gpu::Stream in OpenCV?


Question

OpenCV has a gpu::Stream class that encapsulates a queue of asynchronous calls. Some functions have overloads with an additional gpu::Stream parameter. Aside from the gpu-basics-similarity.cpp sample code, there is very little information in the OpenCV documentation on how and when to use gpu::Stream. For example, it is not very clear (to me) what exactly gpu::Stream::enqueueConvert or gpu::Stream::enqueueCopy do, or how to use gpu::Stream as an additional overload parameter. I'm looking for a tutorial-like overview of gpu::Stream.

Recommended answer

By default, all gpu module functions are synchronous, i.e. the current CPU thread is blocked until the operation finishes.

gpu::Stream is a wrapper for cudaStream_t and allows asynchronous, non-blocking calls. You can also read the "CUDA C Programming Guide" for detailed information about CUDA asynchronous concurrent execution.

Most gpu module functions have an additional gpu::Stream parameter. If you pass a non-default stream, the function call is asynchronous: the call is added to the stream's command queue and control returns to the CPU thread.
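To make the "command queue" idea concrete, here is a minimal sketch in plain C++ with no OpenCV or CUDA involved. ToyStream, enqueue and demoOrder are hypothetical names invented for this illustration; the class only mimics the semantics described above (calls return immediately, commands run in submission order, and a wait call blocks until the queue is empty). It is not how OpenCV or CUDA implement streams.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy model of a stream: a FIFO of commands run by one worker
// thread. It imitates the queue semantics only, not the real implementation.
class ToyStream {
public:
    ToyStream() : worker_([this] { run(); }) {}
    ~ToyStream() {
        { std::lock_guard<std::mutex> lock(m_); stop_ = true; }
        wake_.notify_all();
        worker_.join();
    }
    // Returns immediately; the command runs later, in submission order.
    void enqueue(std::function<void()> cmd) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(cmd)); ++pending_; }
        wake_.notify_all();
    }
    // Blocks the caller until every enqueued command has finished.
    void waitForCompletion() {
        std::unique_lock<std::mutex> lock(m_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }
private:
    void run() {
        for (;;) {
            std::function<void()> cmd;
            {
                std::unique_lock<std::mutex> lock(m_);
                wake_.wait(lock, [this] { return stop_ || !q_.empty(); });
                if (q_.empty()) return;  // stop requested and nothing left to run
                cmd = std::move(q_.front());
                q_.pop();
            }
            cmd();  // run the command outside the lock
            { std::lock_guard<std::mutex> lock(m_); --pending_; }
            idle_.notify_all();
        }
    }
    std::mutex m_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()>> q_;
    int pending_ = 0;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Enqueue three commands, then wait; the log is read only after the wait.
std::vector<int> demoOrder() {
    std::vector<int> log;
    ToyStream stream;
    for (int i = 1; i <= 3; ++i)
        stream.enqueue([&log, i] { log.push_back(i); });
    // The calling thread could do unrelated CPU work here.
    stream.waitForCompletion();
    assert(log.size() == 3);
    return log;
}
```

In this toy model the results are safe to read only after waitForCompletion() returns, which mirrors the pattern used with a real non-default stream.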

gpu::Stream also provides methods for asynchronous memory transfers between CPU<->GPU and GPU<->GPU. However, asynchronous CPU<->GPU transfers work only with page-locked host memory. A separate class, gpu::CudaMem, encapsulates such memory.

Currently, you may face problems if the same operation is enqueued twice with different data to different streams. Some functions use constant or texture GPU memory, and the next call may update that memory before the previous one has finished. Calling different operations asynchronously is safe, because each operation has its own constant buffer. Memory copy/upload/download/set operations on buffers you hold are also safe.
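The hazard described above can be sketched with a deliberately simplified model in plain C++. Everything here is hypothetical (ToyQueue, enqueueScale, enqueueOffset and the global "constants" are invented for illustration), and it assumes the parameter is written to the shared constant at enqueue time while the queued work reads it later. The real mechanism inside the gpu module may differ; the sketch only shows why a buffer shared by two pending calls to the same operation is a problem.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-ins for per-operation constant memory on the GPU.
static int g_scaleConstant = 0;
static int g_offsetConstant = 0;

// Toy stream: enqueue-style calls only record work; it runs in drain().
struct ToyQueue {
    std::vector<std::function<void()>> cmds;
    void drain() {
        for (auto& cmd : cmds) cmd();
        cmds.clear();
    }
};

// Assumption: the parameter is stored in the shared constant immediately,
// and the queued "kernel" reads that constant when it finally runs.
void enqueueScale(ToyQueue& q, int factor, int input, int& output) {
    g_scaleConstant = factor;
    q.cmds.push_back([input, &output] { output = input * g_scaleConstant; });
}

// A different operation with its own, separate constant.
void enqueueOffset(ToyQueue& q, int offset, int input, int& output) {
    g_offsetConstant = offset;
    q.cmds.push_back([input, &output] { output = input + g_offsetConstant; });
}

struct Pair { int first; int second; };

// Same operation on two streams: the second call overwrites the constant
// before the first queued command has run, so both use factor 3.
Pair sameOperationTwice() {
    ToyQueue s1, s2;
    int a = 0, b = 0;
    enqueueScale(s1, 2, 10, a);  // intended result: 20
    enqueueScale(s2, 3, 10, b);  // intended result: 30
    s1.drain();
    s2.drain();
    return {a, b};
}

// Different operations on two streams: each has its own constant,
// so neither call disturbs the other.
Pair differentOperations() {
    ToyQueue s1, s2;
    int a = 0, b = 0;
    enqueueScale(s1, 2, 10, a);
    enqueueOffset(s2, 3, 10, b);
    s1.drain();
    s2.drain();
    return {a, b};
}
```

In this model sameOperationTwice() yields 30 for both outputs instead of 20 and 30, while differentOperations() yields the intended 20 and 13. A cautious workaround under these assumptions is to wait for the first stream to finish before enqueuing the same operation on another one.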

Here is a small sample:

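// assumes the OpenCV 2.x gpu module: #include <opencv2/gpu/gpu.hpp>,
// with using namespace cv; and using namespace cv::gpu;

// allocate page-locked memory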
CudaMem host_src_pl(768, 1024, CV_8UC1, CudaMem::ALLOC_PAGE_LOCKED);
CudaMem host_dst_pl;

// get Mat header for CudaMem (no data copy)
Mat host_src = host_src_pl;

// fill mat on CPU
someCPUFunc(host_src);

GpuMat gpu_src, gpu_dst;

// create Stream object
Stream stream;

// next calls are non-blocking

// first upload data from host
stream.enqueueUpload(host_src_pl, gpu_src);
// perform blur
blur(gpu_src, gpu_dst, Size(5,5), Point(-1,-1), stream);
// download result back to host
stream.enqueueDownload(gpu_dst, host_dst_pl);

// call another CPU function in parallel with GPU
anotherCPUFunc();

// wait for the GPU to finish
stream.waitForCompletion();

// now you can use GPU results
Mat host_dst = host_dst_pl;
