Friday, December 16, 2011

More issues, this time with block size

So, the indexing issues are gone, sort of.

Now I have a problem that I feel I need to understand CUDA and the GPU better than I currently do to figure out.

Here is the deal.
I have a 1D array of size 150 corresponding to a 6x5x5 grid, so I launch my kernel with a block size of dim3(6,5,5). All I am trying to do is write 10.0f to every spot in the array. With this block size, the array comes back full of random floats instead of 10.0f.

If I use dim3(6,5,4) instead, the kernel writes 10.0f to the corresponding 120 spots.

Why is this? A 6x5x5 block is only 150 threads, which is not even close to 512, so I don't understand why this is happening.
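To narrow this down, it may help to check whether the kernel launches at all, since a failed launch leaves the array untouched and would explain the random floats. The following is only a minimal sketch under my own assumptions, not my actual code: the kernel name fillTen, the single-block launch, and the x-fastest index flattening are all hypothetical stand-ins. It also assumes a toolkit that provides cudaDeviceSynchronize (CUDA 4.0 or later).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes 10.0f to its own slot.
__global__ void fillTen(float *out, int n)
{
    // Flatten the 3D thread index into a 1D offset (x varies fastest).
    int idx = threadIdx.x
            + threadIdx.y * blockDim.x
            + threadIdx.z * blockDim.x * blockDim.y;

    // Guard against writing past the end of the array.
    if (idx < n) {
        out[idx] = 10.0f;
    }
}

int main()
{
    const int n = 6 * 5 * 5;   // 150 elements, one per thread
    float host[n];
    float *dev = NULL;

    cudaError_t err = cudaMalloc((void **)&dev, n * sizeof(float));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // One block of 6x5x5 threads.
    fillTen<<<1, dim3(6, 5, 5)>>>(dev, n);

    // Errors in the launch itself (bad configuration, too many resources).
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
    }

    // Errors that occur while the kernel is running.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
    }

    err = cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("cudaMemcpy failed: %s\n", cudaGetErrorString(err));
    }

    // Count the slots that did not receive 10.0f.
    int wrong = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 10.0f) {
            ++wrong;
        }
    }
    printf("%d of %d slots were not written\n", wrong, n);

    cudaFree(dev);
    return 0;
}
```

If the launch check prints an error for dim3(6,5,5) but not for dim3(6,5,4), the problem is probably with the launch itself (for example, resources per block) rather than with the indexing. If no error is printed and some slots are still wrong, the index calculation in the real kernel is the more likely suspect.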

Any suggestion is appreciated, especially since Google has just failed me.
