From:
siboehm.com
How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
http://siboehm.com/articles/22/CUDA-MMM
Tagged with: performance, gpu, cuda, deeplearning
In this post, I’ll iteratively optimize an implementation of matrix multiplication written in CUDA. My goal is not to build a cuBLAS replacement, but to deepl...