Could anybody please explain why VRAM needs special treatment compared to regular system RAM in this use case? Assuming we can allocate memory in VRAM (probably through the OpenCL API), why can't we reuse the tmpfs/ramfs code? Do I understand correctly that PCI maps VRAM into a certain memory region, which is then accessible via regular CPU instructions? Is it because CPU caching works differently, or because VRAM is uncacheable? Or is it something else?
revelation|11 years ago
This is in fact a (if not the) major limiting factor in the wider use of GPUs for general-purpose calculations: you always have to copy inputs and results between video RAM and normal RAM.
manover|11 years ago
wtallis|11 years ago