9.7.2. Extended-Precision Integer Arithmetic Instructions

Instructions add.cc, addc, sub.cc, subc, mad.cc and madc reference an implicitly specified condition code register (CC) having a single carry flag bit (CC.CF) holding carry-in/carry-out or borrow-in/borrow-out. These instructions support extended-precision integer addition, subtraction, and multiplication. No other instructions access the condition code, and there is no support for setting, clearing, or testing the condition code. The condition code register is not preserved across calls and is mainly intended for use in straight-line code sequences for computing extended-precision integer addition, subtraction, and multiplication.

The extended-precision arithmetic instructions are:

add.cc, addc
sub.cc, subc
mad.cc, madc

9.7.2.1. Extended-Precision Arithmetic Instructions: add.cc

add.cc

Add two values with carry-out.

Syntax

add.cc.type  d, a, b;

.type = { .u32, .s32, .u64, .s64 };

Description

Performs integer addition and writes the carry-out value into the condition code register.

Semantics

ptx

d = a + b;
carry-out written to CC.CF

Notes

No integer rounding modifiers.

No saturation.

Behavior is the same for unsigned and signed integers.

PTX ISA Notes

32-bit add.cc introduced in PTX ISA version 1.2.

64-bit add.cc introduced in PTX ISA version 4.3.

Target ISA Notes

32-bit add.cc is supported on all target architectures.

64-bit add.cc requires sm_20 or higher.

Examples

ptx

@p  add.cc.u32   x1,y1,z1;   // extended-precision addition of
@p  addc.cc.u32  x2,y2,z2;   // two 128-bit values
@p  addc.cc.u32  x3,y3,z3;
@p  addc.u32     x4,y4,z4;

9.7.2.2. Extended-Precision Arithmetic Instructions: addc

addc

Add two values with carry-in and optional carry-out.

Syntax

addc{.cc}.type  d, a, b;

.type = { .u32, .s32, .u64, .s64 };

Description

Performs integer addition with carry-in and optionally writes the carry-out value into the condition code register.

Semantics

ptx

d = a + b + CC.CF;
if .cc specified, carry-out written to CC.CF

Notes

No integer rounding modifiers.

No saturation.

Behavior is the same for unsigned and signed integers.

PTX ISA Notes

32-bit addc introduced in PTX ISA version 1.2.

64-bit addc introduced in PTX ISA version 4.3.

Target ISA Notes

32-bit addc is supported on all target architectures.

64-bit addc requires sm_20 or higher.

Examples

ptx

@p  add.cc.u32   x1,y1,z1;   // extended-precision addition of
@p  addc.cc.u32  x2,y2,z2;   // two 128-bit values
@p  addc.cc.u32  x3,y3,z3;
@p  addc.u32     x4,y4,z4;

9.7.2.3. Extended-Precision Arithmetic Instructions: sub.cc

sub.cc

Subtract one value from another, with borrow-out.

Syntax

sub.cc.type  d, a, b;

.type = { .u32, .s32, .u64, .s64 };

Description

Performs integer subtraction and writes the borrow-out value into the condition code register.

Semantics

ptx

d = a - b;
borrow-out written to CC.CF

Notes

No integer rounding modifiers.

No saturation.

Behavior is the same for unsigned and signed integers.

PTX ISA Notes

32-bit sub.cc introduced in PTX ISA version 1.2.

64-bit sub.cc introduced in PTX ISA version 4.3.

Target ISA Notes

32-bit sub.cc is supported on all target architectures.

64-bit sub.cc requires sm_20 or higher.

Examples

ptx

@p  sub.cc.u32   x1,y1,z1;   // extended-precision subtraction
@p  subc.cc.u32  x2,y2,z2;   // of two 128-bit values
@p  subc.cc.u32  x3,y3,z3;
@p  subc.u32     x4,y4,z4;

9.7.2.4. Extended-Precision Arithmetic Instructions: subc

subc

Subtract one value from another, with borrow-in and optional borrow-out.

Syntax

subc{.cc}.type  d, a, b;

.type = { .u32, .s32, .u64, .s64 };

Description

Performs integer subtraction with borrow-in and optionally writes the borrow-out value into the condition code register.

Semantics

ptx

d = a  - (b + CC.CF);
if .cc specified, borrow-out written to CC.CF

Notes

No integer rounding modifiers.

No saturation.

Behavior is the same for unsigned and signed integers.

PTX ISA Notes

32-bit subc introduced in PTX ISA version 1.2.

64-bit subc introduced in PTX ISA version 4.3.

Target ISA Notes

32-bit subc is supported on all target architectures.

64-bit subc requires sm_20 or higher.

Examples

ptx

@p  sub.cc.u32   x1,y1,z1;   // extended-precision subtraction
@p  subc.cc.u32  x2,y2,z2;   // of two 128-bit values
@p  subc.cc.u32  x3,y3,z3;
@p  subc.u32     x4,y4,z4;

9.7.2.5. Extended-Precision Arithmetic Instructions: mad.cc

mad.cc

Multiply two values, extract high or low half of result, and add a third value with carry-out.

Syntax

mad{.hi,.lo}.cc.type  d, a, b, c;

.type = { .u32, .s32, .u64, .s64 };

Description

Multiplies two values, extracts either the high or low part of the result, and adds a third value. Writes the result to the destination register and the carry-out from the addition into the condition code register.

Semantics

ptx

t = a * b;
d = t<63..32> + c;    // for .hi variant
d = t<31..0> + c;     // for .lo variant
carry-out from addition is written to CC.CF

Notes

Generally used in combination with madc and addc to implement extended-precision multi-word multiplication. See madc for an example.

PTX ISA Notes

32-bit mad.cc introduced in PTX ISA version 3.0.

64-bit mad.cc introduced in PTX ISA version 4.3.

Target ISA Notes

Requires target sm_20 or higher.

Examples

ptx

@p  mad.lo.cc.u32 d,a,b,c;
    mad.lo.cc.u32 r,p,q,r;

9.7.2.6. Extended-Precision Arithmetic Instructions: madc

madc

Multiply two values, extract high or low half of result, and add a third value with carry-in and optional carry-out.

Syntax

madc{.hi,.lo}{.cc}.type  d, a, b, c;

.type = { .u32, .s32, .u64, .s64 };

Description

Multiplies two values, extracts either the high or low part of the result, and adds a third value along with carry-in. Writes the result to the destination register and optionally writes the carry-out from the addition into the condition code register.

Semantics

ptx

t = a * b;
d = t<63..32> + c + CC.CF;     // for .hi variant
d = t<31..0> + c + CC.CF;      // for .lo variant
if .cc specified, carry-out from addition is written to CC.CF

Notes

Generally used in combination with mad.cc and addc to implement extended-precision multi-word multiplication. See example below.

PTX ISA Notes

32-bit madc introduced in PTX ISA version 3.0.

64-bit madc introduced in PTX ISA version 4.3.

Target ISA Notes

Requires target sm_20 or higher.

Examples

ptx

// extended-precision multiply:  [r3,r2,r1,r0] = [r5,r4] * [r7,r6]
mul.lo.u32     r0,r4,r6;      // r0=(r4*r6).[31:0], no carry-out
mul.hi.u32     r1,r4,r6;      // r1=(r4*r6).[63:32], no carry-out
mad.lo.cc.u32  r1,r5,r6,r1;   // r1+=(r5*r6).[31:0], may carry-out
madc.hi.u32    r2,r5,r6,0;    // r2 =(r5*r6).[63:32]+carry-in,
                              // no carry-out
mad.lo.cc.u32   r1,r4,r7,r1;  // r1+=(r4*r7).[31:0], may carry-out
madc.hi.cc.u32  r2,r4,r7,r2;  // r2+=(r4*r7).[63:32]+carry-in,
                              // may carry-out
addc.u32        r3,0,0;       // r3 = carry-in, no carry-out
mad.lo.cc.u32   r2,r5,r7,r2;  // r2+=(r5*r7).[31:0], may carry-out
madc.hi.u32     r3,r5,r7,r3;  // r3+=(r5*r7).[63:32]+carry-in