9.7.12. Control Flow Instructions

The following PTX instructions and syntax are for controlling execution in a PTX program:

  • {}
  • @
  • bra
  • brx.idx
  • call
  • ret
  • exit

9.7.12.1. Control Flow Instructions: {}

{}

Instruction grouping.

Syntax

{ instructionList }

Description

The curly braces create a group of instructions, used primarily for defining a function body. The curly braces also provide a mechanism for determining the scope of a variable: any variable declared within a scope is not available outside the scope.

PTX ISA Notes

Introduced in PTX ISA version 1.0.

Target ISA Notes

Supported on all target architectures.

Examples

ptx
{ add.s32  a,b,c; mov.s32  d,a; }

9.7.12.2. Control Flow Instructions: @

@

Predicated execution.

Syntax

@{!}p    instruction;

Description

Execute an instruction or instruction block for threads that have the guard predicate True. Threads with a False guard predicate do nothing.

Semantics

If {!}p then instruction

PTX ISA Notes

Introduced in PTX ISA version 1.0.

Target ISA Notes

Supported on all target architectures.

Examples

ptx
    setp.eq.f32  p,y,0;     // is y zero?
@!p div.f32      ratio,x,y  // avoid division by zero

@q  bra L23;                // conditional branch

9.7.12.3. Control Flow Instructions: bra

bra

Branch to a target and continue execution there.

Syntax

@p   bra{.uni}  tgt;           // tgt is a label
     bra{.uni}  tgt;           // unconditional branch

Description

Continue execution at the target. Conditional branches are specified by using a guard predicate. The branch target must be a label.

bra.uni is guaranteed to be non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and branch target.

Semantics

ptx
if (p) {
    pc = tgt;
}

PTX ISA Notes

Introduced in PTX ISA version 1.0.

Unimplemented indirect branch introduced in PTX ISA version 2.1 has been removed from the spec.

Target ISA Notes

Supported on all target architectures.

Examples

ptx
bra.uni  L_exit;    // uniform unconditional jump
@q  bra      L23;   // conditional branch

9.7.12.4. Control Flow Instructions: brx.idx

brx.idx

Branch to a label indexed from a list of potential branch targets.

Syntax

@p    brx.idx{.uni} index, tlist;
      brx.idx{.uni} index, tlist;

Description

Index into a list of possible destination labels, and continue execution from the chosen label. Conditional branches are specified by using a guard predicate.

brx.idx.uni guarantees that the branch is non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and the index argument.

The index operand is a .u32 register. The tlist operand must be the label of a .branchtargets directive. It is accessed as a zero-based sequence using index. Behaviour is undefined if the value of index is greater than or equal to the length of tlist.

The .branchtargets directive must be defined in the local function scope before it is used. It must refer to labels within the current function.

Semantics

ptx
if (p) {
    if (index < length(tlist)) {
      pc = tlist[index];
    } else {
      pc = undefined;
    }
}

PTX ISA Notes

Introduced in PTX ISA version 6.0.

Target ISA Notes

Requires sm_30 or higher.

Examples

ptx
.function foo () {
    .reg .u32 %r0;
    ...
    L1:
    ...
    L2:
    ...
    L3:
    ...
    ts: .branchtargets L1, L2, L3;
    @p brx.idx %r0, ts;
    ...
}

9.7.12.5. Control Flow Instructions: call

call

Call a function, recording the return location.

Syntax

// direct call to named function, func is a symbol
call{.uni} (ret-param), func, (param-list);
call{.uni} func, (param-list);
call{.uni} func;

// indirect call via pointer, with full list of call targets
call{.uni} (ret-param), fptr, (param-list), flist;
call{.uni} fptr, (param-list), flist;
call{.uni} fptr, flist;

// indirect call via pointer, with no knowledge of call targets
call{.uni} (ret-param), fptr, (param-list), fproto;
call{.uni} fptr, (param-list), fproto;
call{.uni} fptr, fproto;

Description

The call instruction stores the address of the next instruction, so execution can resume at that point after executing a ret instruction. A call is assumed to be divergent unless the .uni suffix is present. The .uni suffix indicates that the call is guaranteed to be non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and call target.

For direct calls, the called location func must be a symbolic function name; for indirect calls, the called location fptr must be an address of a function held in a register. Input arguments and return values are optional. Arguments may be registers, immediate constants, or variables in .param space. Arguments are pass-by-value.

Indirect calls require an additional operand, flist or fproto, to communicate the list of potential call targets or the common function prototype of all call targets, respectively. In the first case, flist gives a complete list of potential call targets and the optimizing backend is free to optimize the calling convention. In the second case, where the complete list of potential call targets may not be known, the common function prototype is given and the call must obey the ABI's calling convention.

The flist operand is either the name of an array (call table) initialized to a list of function names; or a label associated with a .calltargets directive, which declares a list of potential call targets. In both cases the fptr register holds the address of a function listed in the call table or .calltargets list, and the call operands are type-checked against the type signature of the functions indicated by flist.

The fproto operand is the name of a label associated with a .callprototype directive. This operand is used when a complete list of potential targets is not known. The call operands are type-checked against the prototype, and code generation will follow the ABI calling convention. If a function that doesn't match the prototype is called, the behavior is undefined.

Call tables may be declared at module scope or local scope, in either the constant or global state space. The .calltargets and .callprototype directives must be declared within a function body. All functions must be declared prior to being referenced in a call table initializer or .calltargets directive.

PTX ISA Notes

Direct call introduced in PTX ISA version 1.0. Indirect call introduced in PTX ISA version 2.1.

Target ISA Notes

Direct call supported on all target architectures. Indirect call requires sm_20 or higher.

Examples

ptx
// examples of direct call
    call     init;    // call function 'init'
    call.uni g, (a);  // call function 'g' with parameter 'a'
@p  call     (d), h, (a, b);  // return value into register d

// call-via-pointer using jump table
.func (.reg .u32 rv) foo (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) bar (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) baz (.reg .u32 a, .reg .u32 b) ...

.global .u32 jmptbl[5] = { foo, bar, baz };
      ...
@p    ld.global.u32  %r0, [jmptbl+4];
@p    ld.global.u32  %r0, [jmptbl+8];
      call  (retval), %r0, (x, y), jmptbl;

// call-via-pointer using .calltargets directive
.func (.reg .u32 rv) foo (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) bar (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) baz (.reg .u32 a, .reg .u32 b) ...
      ...
@p    mov.u32  %r0, foo;
@q    mov.u32  %r0, baz;
Ftgt: .calltargets foo, bar, baz;
      call  (retval), %r0, (x, y), Ftgt;

// call-via-pointer using .callprototype directive
.func dispatch (.reg .u32 fptr, .reg .u32 idx)
{
...
Fproto: .callprototype _ (.param .u32 _, .param .u32 _);
      call  %fptr, (x, y), Fproto;
...

9.7.12.6. Control Flow Instructions: ret

ret

Return from function to instruction after call.

Syntax

ret{.uni};

Description

Return execution to caller's environment. A divergent return suspends threads until all threads are ready to return to the caller. This allows multiple divergent ret instructions.

A ret is assumed to be divergent unless the .uni suffix is present, indicating that the return is guaranteed to be non-divergent.

Any values returned from a function should be moved into the return parameter variables prior to executing the ret instruction.

A return instruction executed in a top-level entry routine will terminate thread execution.

PTX ISA Notes

Introduced in PTX ISA version 1.0.

Target ISA Notes

Supported on all target architectures.

Examples

ptx
    ret;
@p  ret;

9.7.12.7. Control Flow Instructions: exit

exit

Terminate a thread.

Syntax

exit;

Description

Ends execution of a thread.

Barriers exclusively waiting on arrivals from exited threads are always released.

PTX ISA Notes

Introduced in PTX ISA version 1.0.

Target ISA Notes

Supported on all target architectures.

Examples

ptx
    exit;
@p  exit;