CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
#include <pitch_linear_thread_map.h>
Classes
| struct | Detail |
| Internal implementation details. |
Public Types
| using | TensorCoord = layout::PitchLinearCoord |
| Tensor coordinate. |
| using | Shape = Shape_ |
| Tile shape. |
| using | ThreadAccessShape = cutlass::layout::PitchLinearShape< 4, 4 > |
| Access Shape of each thread. |
| using | Iterations = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< 1, (Threads >= Detail::ShapeVec::kContiguous ? Detail::ShapeVec::kStrided / (kThreads / Detail::ShapeVec::kContiguous) : 0) >, layout::PitchLinearShape< Detail::ShapeVec::kContiguous / kThreads, Detail::ShapeVec::kStrided > >::type |
| Number of iterations by each thread. |
| using | Delta = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< Shape::kContiguous, kThreads * ThreadAccessShape::kStrided / Detail::ShapeVec::kContiguous >, layout::PitchLinearShape< kThreads * ThreadAccessShape::kContiguous, 1 > >::type |
Static Public Member Functions
| static CUTLASS_HOST_DEVICE TensorCoord | initial_offset (int thread_id) |
Static Public Attributes
| static int const | kThreads = Threads |
| Number of threads total. |
| static int const | kElementsPerAccess = ThreadAccessShape::kContiguous |
| Extract length of each access from Layout. |
| using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Delta = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< Shape::kContiguous, kThreads * ThreadAccessShape::kStrided / Detail::ShapeVec::kContiguous >, layout::PitchLinearShape< kThreads * ThreadAccessShape::kContiguous, 1 > >::type |
Interval between accesses along each dimension of the tensor's logical coordinate space (in units of Elements)
| using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Iterations = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< 1, (Threads >= Detail::ShapeVec::kContiguous ? Detail::ShapeVec::kStrided / (kThreads / Detail::ShapeVec::kContiguous) : 0) >, layout::PitchLinearShape< Detail::ShapeVec::kContiguous / kThreads, Detail::ShapeVec::kStrided > >::type |
| using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Shape = Shape_ |
| using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::TensorCoord = layout::PitchLinearCoord |
| using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::ThreadAccessShape = cutlass::layout::PitchLinearShape<4, 4> |
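The Iterations and Delta expressions above are easier to follow with concrete numbers. The following is a minimal standalone sketch, not the CUTLASS source: `ThreadMapSketch` and the local `PitchLinearShape` are hypothetical stand-ins, `std::conditional` replaces `platform::conditional`, and `Detail::ShapeVec` is assumed to be the tile shape divided by the 4x4 ThreadAccessShape, which this page does not state.

```cpp
#include <type_traits>

// Local stand-in for cutlass::layout::PitchLinearShape (not the CUTLASS type itself).
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
};

// Hypothetical mirror of the Iterations and Delta typedefs documented above.
template <typename Shape_, int Threads>
struct ThreadMapSketch {
  using Shape = Shape_;
  using ThreadAccessShape = PitchLinearShape<4, 4>;
  static constexpr int kThreads = Threads;

  // ASSUMPTION: Detail::ShapeVec is the tile shape counted in 4x4 thread accesses.
  using ShapeVec = PitchLinearShape<Shape::kContiguous / ThreadAccessShape::kContiguous,
                                    Shape::kStrided / ThreadAccessShape::kStrided>;

  // True when there are enough threads to span one full row of accesses.
  static constexpr bool kSpansRow = (Threads >= ShapeVec::kContiguous);

  using Iterations = typename std::conditional<
      kSpansRow,
      PitchLinearShape<1, (kSpansRow ? ShapeVec::kStrided / (kThreads / ShapeVec::kContiguous) : 0)>,
      PitchLinearShape<ShapeVec::kContiguous / kThreads, ShapeVec::kStrided>>::type;

  using Delta = typename std::conditional<
      kSpansRow,
      PitchLinearShape<Shape::kContiguous,
                       kThreads * ThreadAccessShape::kStrided / ShapeVec::kContiguous>,
      PitchLinearShape<kThreads * ThreadAccessShape::kContiguous, 1>>::type;
};

// Worked example: a 64x32 tile shared by 32 threads.
using Map = ThreadMapSketch<PitchLinearShape<64, 32>, 32>;
static_assert(Map::ShapeVec::kContiguous == 16 && Map::ShapeVec::kStrided == 8,
              "the tile holds 16x8 accesses");
static_assert(Map::Iterations::kContiguous == 1 && Map::Iterations::kStrided == 4,
              "each thread iterates 1x4 times");
static_assert(Map::Delta::kContiguous == 64 && Map::Delta::kStrided == 8,
              "successive accesses are 64x8 elements apart");
```

Under that assumption, 32 threads cover a 64x8 band of elements per iteration (16 accesses across, 2 down), so four strided iterations with a strided step of 8 cover the 32-row tile.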
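| static CUTLASS_HOST_DEVICE TensorCoord cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::initial_offset (int thread_id) | inline static |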
Maps thread ID to a coordinate offset within the tensor's logical coordinate space (in units of Elements)
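This page does not show the body of initial_offset, so the following is only a plausible sketch of such a mapping. It assumes threads are numbered along the contiguous dimension first and that each thread owns one 4x4 access; `CoordSketch`, `initial_offset_sketch`, and the `shape_vec_contiguous` parameter are hypothetical names introduced for illustration.

```cpp
// Local stand-in for layout::PitchLinearCoord (not the CUTLASS type itself).
struct CoordSketch {
  int contiguous;
  int strided;
};

// ASSUMPTION: threads are numbered along the contiguous dimension first, and each
// owns one 4x4 access. shape_vec_contiguous is the tile's contiguous extent counted
// in accesses (for example 64 / 4 = 16). Hypothetical helper, not the CUTLASS source.
constexpr CoordSketch initial_offset_sketch(int thread_id, int shape_vec_contiguous) {
  return CoordSketch{(thread_id % shape_vec_contiguous) * 4,
                     (thread_id / shape_vec_contiguous) * 4};
}

// For a tile 64 elements wide: thread 0 starts at (0, 0), thread 1 at (4, 0),
// and thread 16 wraps to the next band of four rows at (0, 4).
static_assert(initial_offset_sketch(1, 16).contiguous == 4,
              "thread 1 starts one access to the right");
static_assert(initial_offset_sketch(16, 16).contiguous == 0,
              "thread 16 wraps back to column 0");
static_assert(initial_offset_sketch(16, 16).strided == 4,
              "thread 16 starts four rows down");
```

The returned offset is in units of elements, matching the description above; a thread would then advance from this offset by Delta on each of its Iterations.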