CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Functions
template<typename Fragment> CUTLASS_DEVICE void dump_fragment(Fragment const &frag, int N=0, int M=0, int S=1)
template<typename Element> CUTLASS_DEVICE void dump_shmem(Element const *ptr, size_t size, int S=1)
template<typename Fragment> CUTLASS_DEVICE void cutlass::debug::dump_fragment(Fragment const &frag, int N = 0, int M = 0, int S = 1)
The first N threads each dump the first M elements of their fragment, stepping through them with a stride of S elements. If N is left at its default of 0, every thread dumps its data. If M is left at its default of 0, all elements of the fragment are dumped.
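The selection rule described above can be modeled on the host. This is a minimal sketch and not CUTLASS code: `dumped_indices` and `thread_dumps` are hypothetical helper names. It assumes that the stride walks indices 0, S, 2S, ... while staying below M, which is one plausible reading of the description.

```cpp
#include <vector>

// Hypothetical host-side model of the documented selection rule; not part of CUTLASS.
// Assumption: the stride S visits indices 0, S, 2S, ... that stay below M.
std::vector<int> dumped_indices(int fragment_size, int M = 0, int S = 1) {
  if (M == 0) M = fragment_size;  // M left at 0: cover the whole fragment
  if (S < 1) S = 1;               // guard for this model only
  std::vector<int> out;
  for (int i = 0; i < M && i < fragment_size; i += S) {
    out.push_back(i);
  }
  return out;
}

// Returns true if the thread with linear index thread_id would dump anything.
bool thread_dumps(int thread_id, int total_threads, int N = 0) {
  if (N == 0) N = total_threads;  // N left at 0: every thread dumps
  return thread_id < N;
}
```

Under these assumptions, `dumped_indices(8, 6, 2)` yields the indices 0, 2, 4, and `thread_dumps(2, 32, 2)` is false because only threads 0 and 1 are selected.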
template<typename Element> CUTLASS_DEVICE void cutlass::debug::dump_shmem(Element const *ptr, size_t size, int S = 1)
Dumps the contents of shared memory. ptr is the starting address, size is the number of elements to dump, and S is the stride between dumped elements.
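To make the roles of ptr, size, and S concrete, here is a minimal host-side sketch of a strided dump over a plain array. `strided_dump` is a hypothetical stand-in, not the CUTLASS function. It assumes that size is the extent of the region and that the stride visits ptr[0], ptr[S], ptr[2S], ... while the index stays below size. The device function may behave differently in detail.

```cpp
#include <cstddef>
#include <string>

// Hypothetical host-side stand-in for a strided dump; not part of CUTLASS.
// Assumption: size is the extent of the region, and the stride S visits
// ptr[0], ptr[S], ptr[2S], ... while the index stays below size.
template <typename Element>
std::string strided_dump(Element const *ptr, std::size_t size, int S = 1) {
  if (S < 1) S = 1;  // guard for this model only
  std::string line;
  for (std::size_t i = 0; i < size; i += static_cast<std::size_t>(S)) {
    line += std::to_string(ptr[i]);  // format one visited element
    line += ' ';
  }
  return line;
}
```

With an `int` buffer holding 10 through 15, `strided_dump(buf, 6, 3)` visits indices 0 and 3 and returns `"10 13 "`.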