9.7.12. Control Flow Instructions
The following PTX instructions and syntax are for controlling execution in a PTX program:
{}, @, bra, brx.idx, call, ret, exit
9.7.12.1. Control Flow Instructions: {}
{}
Instruction grouping.
Syntax
{ instructionList }
Description
The curly braces create a group of instructions, used primarily for defining a function body. The curly braces also provide a mechanism for determining the scope of a variable: any variable declared within a scope is not available outside the scope.
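The scoping rule can be illustrated with a nested group. This is a minimal sketch: the kernel name scope_demo and the register names are hypothetical, and the version and target lines are assumptions chosen only to make the module complete.

```ptx
.version 6.0
.target sm_30
.address_size 64

// Hypothetical kernel used only to illustrate variable scope.
.visible .entry scope_demo ()
{
    .reg .s32 a;            // declared in the function body, visible below
    mov.s32 a, 1;
    {
        .reg .s32 tmp;      // declared in the inner group only
        add.s32 tmp, a, a;  // 'a' from the enclosing scope is visible here
        mov.s32 a, tmp;
    }
    // 'tmp' is no longer available here; only 'a' can be referenced.
    ret;
}
```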
PTX ISA Notes
Introduced in PTX ISA version 1.0.
Target ISA Notes
Supported on all target architectures.
Examples
{ add.s32 a,b,c; mov.s32 d,a; }
9.7.12.2. Control Flow Instructions: @
@
Predicated execution.
Syntax
@{!}p instruction;
Description
Execute an instruction or instruction block for threads that have the guard predicate True. Threads with a False guard predicate do nothing.
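A common use of guard predicates is to select between two values without branching. The sketch below assumes a hypothetical function max_s32; each thread executes exactly one of the two guarded mov instructions and does nothing for the other.

```ptx
.version 6.0
.target sm_30

// Hypothetical helper: m = max(x, y) using predication instead of a branch.
.func (.reg .s32 m) max_s32 (.reg .s32 x, .reg .s32 y)
{
    .reg .pred p;
    setp.gt.s32 p, x, y;    // p = (x > y)
    @p  mov.s32 m, x;       // executed by threads where p is True
    @!p mov.s32 m, y;       // executed by threads where p is False
    ret;
}
```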
Semantics
If {!}p then instruction
PTX ISA Notes
Introduced in PTX ISA version 1.0.
Target ISA Notes
Supported on all target architectures.
Examples
setp.eq.f32 p,y,0; // is y zero?
@!p div.f32 ratio,x,y; // avoid division by zero
@q bra L23; // conditional branch
9.7.12.3. Control Flow Instructions: bra
bra
Branch to a target and continue execution there.
Syntax
@p bra{.uni} tgt; // tgt is a label
bra{.uni} tgt; // unconditional branch
Description
Continue execution at the target. Conditional branches are specified by using a guard predicate. The branch target must be a label.
bra.uni is guaranteed to be non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and branch target.
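As an illustration, a counted loop can be built from one conditional and one unconditional branch. This is a sketch under assumptions: the function sum_below and its labels are hypothetical. The loop exit is left as a plain bra because different threads may pass different values of n, while the backward branch carries no guard predicate and has a single target, so it can be marked .uni.

```ptx
.version 6.0
.target sm_30

// Hypothetical helper: sum = 0 + 1 + ... + (n - 1).
.func (.reg .u32 sum) sum_below (.reg .u32 n)
{
    .reg .u32  i;
    .reg .pred p;

    mov.u32 sum, 0;
    mov.u32 i, 0;
L_loop:
    setp.ge.u32 p, i, n;    // p = (i >= n)
    @p bra L_done;          // conditional branch, possibly divergent
    add.u32 sum, sum, i;
    add.u32 i, i, 1;
    bra.uni L_loop;         // unconditional branch back to the loop head
L_done:
    ret;
}
```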
Semantics
if (p) {
pc = tgt;
}
PTX ISA Notes
Introduced in PTX ISA version 1.0.
The indirect branch introduced in PTX ISA version 2.1 was never implemented and has been removed from the specification.
Target ISA Notes
Supported on all target architectures.
Examples
bra.uni L_exit; // uniform unconditional jump
@q bra L23; // conditional branch
9.7.12.4. Control Flow Instructions: brx.idx
brx.idx
Branch to a label indexed from a list of potential branch targets.
Syntax
@p brx.idx{.uni} index, tlist;
brx.idx{.uni} index, tlist;
Description
Index into a list of possible destination labels, and continue execution from the chosen label. Conditional branches are specified by using a guard predicate.
brx.idx.uni guarantees that the branch is non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and the index argument.
The index operand is a .u32 register. The tlist operand must be the label of a .branchtargets directive. It is accessed as a zero-based sequence using index. Behavior is undefined if the value of index is greater than or equal to the length of tlist.
The .branchtargets directive must be defined in the local function scope before it is used. It must refer to labels within the current function.
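Because an out-of-range index is undefined, one defensive pattern is to test the index against the list length before the indexed branch. The sketch below is illustrative only: the function select_op, its labels, and the fallback value of 0 are hypothetical, and it assumes that the labels named in .branchtargets may be defined later in the same function.

```ptx
.version 6.0
.target sm_30

// Hypothetical helper: idx selects add (0), sub (1), or mul (2);
// any other idx yields 0 instead of reaching the indexed branch.
.func (.reg .u32 r) select_op (.reg .u32 idx, .reg .u32 a, .reg .u32 b)
{
    .reg .pred in_range;

    setp.lt.u32 in_range, idx, 3;   // the target list below has 3 entries
    @!in_range bra L_default;       // never index past the end of the list
ts: .branchtargets L_add, L_sub, L_mul;
    brx.idx idx, ts;
L_add:
    add.u32 r, a, b;
    bra L_end;
L_sub:
    sub.u32 r, a, b;
    bra L_end;
L_mul:
    mul.lo.u32 r, a, b;
    bra L_end;
L_default:
    mov.u32 r, 0;
L_end:
    ret;
}
```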
Semantics
if (p) {
if (index < length(tlist)) {
pc = tlist[index];
} else {
pc = undefined;
}
}
PTX ISA Notes
Introduced in PTX ISA version 6.0.
Target ISA Notes
Requires sm_30 or higher.
Examples
.func foo () {
.reg .u32 %r0;
...
L1:
...
L2:
...
L3:
...
ts: .branchtargets L1, L2, L3;
@p brx.idx %r0, ts;
...
}
9.7.12.5. Control Flow Instructions: call
call
Call a function, recording the return location.
Syntax
// direct call to named function, func is a symbol
call{.uni} (ret-param), func, (param-list);
call{.uni} func, (param-list);
call{.uni} func;
// indirect call via pointer, with full list of call targets
call{.uni} (ret-param), fptr, (param-list), flist;
call{.uni} fptr, (param-list), flist;
call{.uni} fptr, flist;
// indirect call via pointer, with no knowledge of call targets
call{.uni} (ret-param), fptr, (param-list), fproto;
call{.uni} fptr, (param-list), fproto;
call{.uni} fptr, fproto;
Description
The call instruction stores the address of the next instruction, so execution can resume at that point after executing a ret instruction. A call is assumed to be divergent unless the .uni suffix is present. The .uni suffix indicates that the call is guaranteed to be non-divergent, i.e. all active threads in a warp that are currently executing this instruction have identical values for the guard predicate and call target.
For direct calls, the called location func must be a symbolic function name; for indirect calls, the called location fptr must be an address of a function held in a register. Input arguments and return values are optional. Arguments may be registers, immediate constants, or variables in .param space. Arguments are pass-by-value.
Indirect calls require an additional operand, flist or fproto, to communicate the list of potential call targets or the common function prototype of all call targets, respectively. In the first case, flist gives a complete list of potential call targets and the optimizing backend is free to optimize the calling convention. In the second case, where the complete list of potential call targets may not be known, the common function prototype is given and the call must obey the ABI's calling convention.
The flist operand is either the name of an array (call table) initialized to a list of function names; or a label associated with a .calltargets directive, which declares a list of potential call targets. In both cases the fptr register holds the address of a function listed in the call table or .calltargets list, and the call operands are type-checked against the type signature of the functions indicated by flist.
The fproto operand is the name of a label associated with a .callprototype directive. This operand is used when a complete list of potential targets is not known. The call operands are type-checked against the prototype, and code generation will follow the ABI calling convention. If a function that doesn't match the prototype is called, the behavior is undefined.
Call tables may be declared at module scope or local scope, in either the constant or global state space. The .calltargets and .callprototype directives must be declared within a function body. All functions must be declared prior to being referenced in a call table initializer or .calltargets directive.
PTX ISA Notes
Direct call introduced in PTX ISA version 1.0. Indirect call introduced in PTX ISA version 2.1.
Target ISA Notes
Direct call supported on all target architectures. Indirect call requires sm_20 or higher.
Examples
// examples of direct call
call init; // call function 'init'
call.uni g, (a); // call function 'g' with parameter 'a'
@p call (d), h, (a, b); // return value into register d
// call-via-pointer using jump table
.func (.reg .u32 rv) foo (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) bar (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) baz (.reg .u32 a, .reg .u32 b) ...
.global .u32 jmptbl[5] = { foo, bar, baz };
...
@p ld.global.u32 %r0, [jmptbl+4];
@q ld.global.u32 %r0, [jmptbl+8];
call (retval), %r0, (x, y), jmptbl;
// call-via-pointer using .calltargets directive
.func (.reg .u32 rv) foo (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) bar (.reg .u32 a, .reg .u32 b) ...
.func (.reg .u32 rv) baz (.reg .u32 a, .reg .u32 b) ...
...
@p mov.u32 %r0, foo;
@q mov.u32 %r0, baz;
Ftgt: .calltargets foo, bar, baz;
call (retval), %r0, (x, y), Ftgt;
// call-via-pointer using .callprototype directive
.func dispatch (.reg .u32 fptr, .reg .u32 idx)
{
...
Fproto: .callprototype _ (.param .u32 _, .param .u32 _);
call fptr, (x, y), Fproto;
...
}
9.7.12.6. Control Flow Instructions: ret
ret
Return from function to instruction after call.
Syntax
ret{.uni};
Description
Return execution to caller's environment. A divergent return suspends threads until all threads are ready to return to the caller. This allows multiple divergent ret instructions.
A ret is assumed to be divergent unless the .uni suffix is present, indicating that the return is guaranteed to be non-divergent.
Any values returned from a function should be moved into the return parameter variables prior to executing the ret instruction.
A return instruction executed in a top-level entry routine will terminate thread execution.
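The ordering requirement for return values can be sketched with a small callee and caller. The functions add_one and add_two are hypothetical; the point is only that the return parameter rv is written before ret executes, and that the caller resumes at the instruction following each call.

```ptx
.version 6.0
.target sm_30

// Hypothetical callee: returns a + 1 in the return parameter 'rv'.
.func (.reg .u32 rv) add_one (.reg .u32 a)
{
    add.u32 rv, a, 1;           // write the return parameter first
    ret;                        // then return to the caller
}

// Hypothetical caller: applies add_one twice.
.func (.reg .u32 rv) add_two (.reg .u32 a)
{
    .reg .u32 t;
    call (t), add_one, (a);     // t = a + 1
    call (rv), add_one, (t);    // rv = t + 1
    ret;
}
```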
PTX ISA Notes
Introduced in PTX ISA version 1.0.
Target ISA Notes
Supported on all target architectures.
Examples
ret;
@p ret;
9.7.12.7. Control Flow Instructions: exit
exit
Terminate a thread.
Syntax
exit;
Description
Ends execution of a thread.
Barriers exclusively waiting on arrivals from exited threads are always released.
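A typical use of a predicated exit is to retire threads whose global index falls outside the data. The kernel below is a sketch under assumptions: the name fill_ones, its parameters out_ptr and count, and all register names are hypothetical, and a one-dimensional launch is assumed.

```ptx
.version 6.0
.target sm_30
.address_size 64

// Hypothetical kernel: thread i stores 1 to element i when i < count.
.visible .entry fill_ones (.param .u64 out_ptr, .param .u32 count)
{
    .reg .pred %p;
    .reg .u32  %i, %cnt, %bid, %bdim, %t, %one;
    .reg .u64  %base, %off, %addr;

    ld.param.u64 %base, [out_ptr];
    ld.param.u32 %cnt, [count];
    mov.u32 %bid, %ctaid.x;
    mov.u32 %bdim, %ntid.x;
    mov.u32 %t, %tid.x;
    mad.lo.u32 %i, %bid, %bdim, %t;     // i = block index * block size + thread index
    setp.ge.u32 %p, %i, %cnt;
    @%p exit;                           // out-of-range threads terminate here

    cvta.to.global.u64 %base, %base;    // generic address to global address
    mul.wide.u32 %off, %i, 4;           // byte offset of element i
    add.u64 %addr, %base, %off;
    mov.u32 %one, 1;
    st.global.u32 [%addr], %one;
    exit;
}
```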
PTX ISA Notes
Introduced in PTX ISA version 1.0.
Target ISA Notes
Supported on all target architectures.
Examples
exit;
@p exit;