9.7.14.1. Matrix Shape

The matrix multiply and accumulate operations support a limited set of shapes for the operand matrices A, B and C. The shapes of all three matrix operands are collectively described by the tuple MxNxK, where A is an MxK matrix, B is a KxN matrix, while C and D are MxN matrices.

The following matrix shapes are supported for the specified types:

Instruction Scale Sparsity Multiplicand Data-type Shape PTX ISA version
wmma NA Dense Floating-point - .f16 .m16n16k16, .m8n32k16, and .m32n8k16 PTX ISA version 6.0
wmma Dense Alternate floating-point format - .bf16 .m16n16k16, .m8n32k16, and .m32n8k16 PTX ISA version 7.0
wmma Dense Alternate floating-point format - .tf32 .m16n16k8 PTX ISA version 7.0
wmma Dense Integer - .u8/.s8 .m16n16k16, .m8n32k16, and .m32n8k16 PTX ISA version 6.3
wmma Dense Sub-byte integer - .u4/.s4 .m8n8k32 PTX ISA version 6.3
wmma Dense Single-bit - .b1 .m8n8k128 PTX ISA version 6.3
mma NA Dense Floating-point - .f64 .m8n8k4 PTX ISA version 7.0
mma Dense Floating-point - .f64 .m16n8k4, .m16n8k8, and .m16n8k16 PTX ISA version 7.8
mma Dense Floating-point - .f16 .m8n8k4 PTX ISA version 6.4
mma Dense Floating-point - .f16 .m16n8k8 PTX ISA version 6.5
mma Dense Floating-point - .f16 .m16n8k16 PTX ISA version 7.0
mma Dense Alternate floating-point format - .bf16 .m16n8k8 and .m16n8k16 PTX ISA version 7.0
mma Dense Alternate floating-point format - .tf32 .m16n8k4 and .m16n8k8 PTX ISA version 7.0
mma Dense Integer - .u8/.s8 .m8n8k16 PTX ISA version 6.5
mma Dense Integer - .u8/.s8 .m16n8k16 and .m16n8k32 PTX ISA version 7.0
mma Dense Sub-byte integer - .u4/.s4 .m8n8k32 PTX ISA version 6.5
mma Dense Sub-byte integer - .u4/.s4 .m16n8k32 and .m16n8k64 PTX ISA version 7.0
mma Dense Single-bit - .b1 .m8n8k128, .m16n8k128, and .m16n8k256 PTX ISA version 7.0
mma Dense Alternate floating-point format - .e4m3 / .e5m2 .m16n8k32 PTX ISA version 8.4
mma Dense Alternate floating-point format - .e4m3 / .e5m2 .m16n8k16 PTX ISA version 8.7
mma Dense Alternate floating-point format - .e3m2 / .e2m3/.e2m1 .m16n8k32 PTX ISA version 8.7
mma Yes Dense Alternate floating-point format - .e4m3 / .e5m2/.e3m2/.e2m3/.e2m1 X (Scale) .ue8m0 .m16n8k32 PTX ISA version 8.7
mma Dense Alternate floating-point format - .e2m1 X (Scale) .ue8m0/.ue4m3 .m16n8k64 PTX ISA version 8.7
mma NA Sparse Floating-point - .f16 .m16n8k16 and .m16n8k32 PTX ISA version 7.1
mma Sparse Alternate floating-point format - .bf16 .m16n8k16 and .m16n8k32 PTX ISA version 7.1
mma Sparse Alternate floating-point format - .tf32 .m16n8k8 and .m16n8k16 PTX ISA version 7.1
mma Sparse Integer - .u8/.s8 .m16n8k32 and .m16n8k64 PTX ISA version 7.1
mma Sparse Sub-byte integer - .u4/.s4 .m16n8k64 and .m16n8k128 PTX ISA version 7.1
mma Sparse Alternate floating-point format - .e4m3 / .e5m2 .m16n8k64 PTX ISA version 8.4
mma Sparse with ordered metadata Floating-point - .f16 .m16n8k16 and .m16n8k32 PTX ISA version 8.5
mma Sparse with ordered metadata Alternate floating-point format - .bf16 .m16n8k16 and .m16n8k32 PTX ISA version 8.5
mma Sparse with ordered metadata Alternate floating-point format - .tf32 .m16n8k8 and .m16n8k16 PTX ISA version 8.5
mma Sparse with ordered metadata Integer - .u8/.s8 .m16n8k32 and .m16n8k64 PTX ISA version 8.5
mma Sparse with ordered metadata Sub-byte integer - .u4/.s4 .m16n8k64 and .m16n8k128 PTX ISA version 8.5
mma Sparse with ordered metadata Alternate floating-point format - .e4m3 / .e5m2 .m16n8k64 PTX ISA version 8.5
mma Sparse with ordered metadata Alternate floating-point format - .e3m2 / .e2m3/.e2m1 .m16n8k64 PTX ISA version 8.7
mma Yes Sparse with ordered metadata Alternate floating-point format - .e4m3 / .e5m2/.e3m2/.e2m3/.e2m1 X (Scale) .ue8m0 .m16n8k64 PTX ISA version 8.7
mma Sparse with ordered metadata Alternate floating-point format - .e2m1 X (Scale) .ue8m0/.ue4m3 .m16n8k128 PTX ISA version 8.7