- 20 Sep, 2018 3 commits
Daniel Krebs authored
Daniel Krebs authored
Daniel Krebs authored
When the chunk size was already a multiple of the page size, the resulting size would be one page too large.
- 20 Jul, 2018 1 commit
Daniel Krebs authored
This was necessary to make the memory available via GDRcopy when multiple small allocations were made. cudaMalloc() would return multiple memory chunks located in the same GPU page, which GDRcopy does not support (`gdrdrv:offset != 0 is not supported`). As a side effect, this keeps the number of BAR mappings created via GDRcopy low, because they appear to be quite limited.
- 06 Jun, 2018 1 commit
Daniel Krebs authored
- 15 May, 2018 2 commits
Daniel Krebs authored
Using CUDA, memory can be allocated on the GPU and shared with peers on the PCIe bus, such as the FPGA. Furthermore, the GPU's DMA engine can also be used to read from and write to other memory on the PCIe bus, such as BRAM on the FPGA.
Daniel Krebs authored