c++ - Is it possible to speed up matrix multiplication with texture memory?


I am learning CUDA.

Would it be possible to speed up a simple matrix multiplication with texture memory? Spatial locality is a nice property in addition to tiling, but would the overhead of using texture memory outweigh it?

I can't seem to find any implementations of matrix multiplication that use texture memory.

Matrix multiply can be implemented in a variety of ways.

Compared to a naive implementation of matrix multiply that uses only global memory, yes, it's possible to speed it up using texture memory.
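For illustration, here is a minimal sketch of what a naive kernel reading its inputs through texture objects might look like. It assumes square, row-major `n x n` float matrices and a device of compute capability 3.0 or later. The names `matmulTex`, `makeLinearTexture` and `matmulTexture` are hypothetical, and error checking is omitted for brevity; treat it as a starting point for experiments, not a tuned implementation.

```cuda
#include <cuda_runtime.h>

// Naive kernel: one thread per output element.
// A and B are fetched through the texture path instead of plain global loads.
__global__ void matmulTex(cudaTextureObject_t texA, cudaTextureObject_t texB,
                          float* C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n || col >= n) return;

    float acc = 0.0f;
    for (int k = 0; k < n; ++k) {
        acc += tex1Dfetch<float>(texA, row * n + k) *
               tex1Dfetch<float>(texB, k * n + col);
    }
    C[row * n + col] = acc;
}

// Hypothetical helper: wrap a linear device buffer in a texture object.
static cudaTextureObject_t makeLinearTexture(const float* dPtr, size_t count) {
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = const_cast<float*>(dPtr);
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = count * sizeof(float);

    cudaTextureDesc tex = {};
    tex.readMode = cudaReadModeElementType;  // return raw float values

    cudaTextureObject_t obj = 0;
    cudaCreateTextureObject(&obj, &res, &tex, nullptr);
    return obj;
}

// Hypothetical host wrapper: A, B, C are host pointers to n*n floats.
void matmulTexture(const float* A, const float* B, float* C, int n) {
    size_t count = static_cast<size_t>(n) * n;
    size_t bytes = count * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

    cudaTextureObject_t texA = makeLinearTexture(dA, count);
    cudaTextureObject_t texB = makeLinearTexture(dB, count);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    matmulTex<<<grid, block>>>(texA, texB, dC, n);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(texA);
    cudaDestroyTextureObject(texB);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

Whether this actually beats plain global loads depends on the GPU generation and the access pattern, so it is worth timing both versions on your own hardware.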

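Compared to a better-written version of matrix multiply that uses shared memory, it's not likely that texture memory will give much or any benefit, since the tiles staged in shared memory already capture most of the data reuse that the texture cache would otherwise provide.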
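As a rough sketch of the shared-memory approach mentioned above: each thread block stages one tile of A and one tile of B in shared memory, then every thread accumulates its partial dot product from those tiles. The tile size of 16 and the names `matmulTiled` and `matmulShared` are assumptions for illustration, the matrices are assumed square and row-major, and error checking is left out.

```cuda
#include <cuda_runtime.h>

constexpr int TILE = 16;  // assumed tile width; must match the block dimensions

__global__ void matmulTiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    int numTiles = (n + TILE - 1) / TILE;
    for (int t = 0; t < numTiles; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;

        // Each thread loads one element of each tile; pad with zeros at the edges.
        As[threadIdx.y][threadIdx.x] =
            (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // tiles must be fully loaded before use

        for (int k = 0; k < TILE; ++k) {
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        }
        __syncthreads();  // finish reading before the next iteration overwrites
    }

    if (row < n && col < n) {
        C[row * n + col] = acc;
    }
}

// Hypothetical host wrapper: A, B, C are host pointers to n*n floats.
void matmulShared(const float* A, const float* B, float* C, int n) {
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmulTiled<<<grid, block>>>(dA, dB, dC, n);

    cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

With this layout each element of A and B is read from global memory once per tile rather than once per multiply, which is the reuse a texture cache would otherwise be trying to exploit.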

If you want the best performance for a CUDA matrix multiply, you should use cuBLAS. Don't write your own matrix multiply code.
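A minimal sketch of the cuBLAS route, assuming square row-major float matrices on the host. The wrapper name `matmulCublas` is hypothetical and error checking is omitted. cuBLAS expects column-major storage, so the sketch passes B before A: a row-major buffer read as column-major is its transpose, and computing B^T * A^T in column-major terms yields a buffer that reads as A * B in row-major order.

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>

// Hypothetical host wrapper: computes C = A * B for row-major n x n matrices.
void matmulCublas(const float* A, const float* B, float* C, int n) {
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    const float alpha = 1.0f;
    const float beta = 0.0f;
    // Operands are swapped (B first, then A) to account for column-major storage.
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n,
                &alpha,
                dB, n,
                dA, n,
                &beta,
                dC, n);

    // The blocking copy on the default stream waits for the GEMM to finish.
    cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);

    cublasDestroy(handle);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

Build with something like `nvcc example.cu -lcublas`. In real code the handle would normally be created once and reused across calls instead of being created per multiply.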

