Permutation
The permutation part of IR APIs.
- tvm.aipu.script.ir.permutation.vconcat(x, y, part='all')
Concatenates x and y, selecting elements according to the value of the part parameter. The feature Multiple Width Vector is supported.
x(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
y(i32x8): y0 y1 y2 y3 y4 y5 y6 y7

out = S.vconcat(x, y)
out(i32x16): x0 x1 x2 x3 x4 x5 x6 x7 y0 y1 y2 y3 y4 y5 y6 y7

out = S.vconcat(x, y, "low")
out(i32x8): x0 x1 x2 x3 y0 y1 y2 y3

out = S.vconcat(x, y, "high")
out(i32x8): x4 x5 x6 x7 y4 y5 y6 y7

out = S.vconcat(x, y, "even")
out(i32x8): x0 x2 x4 x6 y0 y2 y4 y6

out = S.vconcat(x, y, "odd")
out(i32x8): x1 x3 x5 x7 y1 y3 y5 y7
Parameters
- x, y: Union[PrimExpr, int, float]
The operands.
- part: str
Specifies which elements are selected.
all: Selects all elements.
low, high, even, odd: Select the corresponding half of the elements.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vconcat(va, vb, "low")
vc = S.vconcat(va, 3, "high")
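As a rough illustration of the selection rules shown in the diagram, the following pure-Python sketch models vconcat on plain lists. It is not the Compass DSL implementation: the name vconcat_model is made up for this illustration, and the sketch assumes both inputs have the same, even number of lanes.

```python
def vconcat_model(x, y, part="all"):
    """Plain-list model of the element selection documented for vconcat."""
    half = len(x) // 2
    # Each entry picks the documented subset of one input vector.
    pick = {
        "all": lambda v: list(v),
        "low": lambda v: list(v[:half]),
        "high": lambda v: list(v[half:]),
        "even": lambda v: list(v[0::2]),
        "odd": lambda v: list(v[1::2]),
    }[part]
    # The selected elements of x come first, followed by those of y.
    return pick(x) + pick(y)


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vconcat_model(x, y, "even"))
```

With part set to all, the model returns 16 elements; every other value returns 8, matching the lane counts in the diagram.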
See Also
Zhouyi Compass OpenCL Programming Guide: __vextl, __vexth, __vexte, __vexto.
- tvm.aipu.script.ir.permutation.vsplit(x)
Splits x evenly into multiple parts according to the hardware native vector types. The feature Multiple Width Vector is supported.
x(i32x16): x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15

out0, out1 = S.vsplit(x)
out0(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
out1(i32x8): x8 x9 x10 x11 x12 x13 x14 x15
Parameters
- x: PrimExpr
The operand.
Returns
- ret: Tuple[PrimExpr]
The result expressions.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc0, vc1 = S.vsplit(vx_fp32x16)
vc0, vc1, vc2 = S.vsplit(vx_i32x24)
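The splitting rule can be sketched on plain lists as follows. This is only a model, not the DSL implementation: vsplit_model and its native_lanes parameter are hypothetical, and the default of 8 lanes is an assumption taken from the i32x8 parts in the diagram.

```python
def vsplit_model(x, native_lanes=8):
    """Split a plain list into equal parts of native_lanes elements each."""
    # native_lanes is a hypothetical stand-in for the hardware native vector width.
    if len(x) % native_lanes != 0:
        raise ValueError("length must be a multiple of native_lanes")
    return tuple(
        list(x[start:start + native_lanes])
        for start in range(0, len(x), native_lanes)
    )


out0, out1 = vsplit_model(list(range(16)))
p0, p1, p2 = vsplit_model(list(range(24)))
print(out0, out1)
```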
See Also
- tvm.aipu.script.ir.permutation.vzip(x, y, part='all')
Selects some elements from x and y, rearranges them in an interleaved sequence, and returns the result. The elements are selected according to the value of the part parameter. The case where both x and y are masks is also supported. The feature Flexible Width Vector is supported only when part is all. The feature Multiple Width Vector is supported.
x(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
y(i32x8): y0 y1 y2 y3 y4 y5 y6 y7

out = S.vzip(x, y)
out(i32x16): x0 y0 x1 y1 x2 y2 x3 y3 x4 y4 x5 y5 x6 y6 x7 y7

out = S.vzip(x, y, "low")
out(i32x8): x0 y0 x1 y1 x2 y2 x3 y3

out = S.vzip(x, y, "high")
out(i32x8): x4 y4 x5 y5 x6 y6 x7 y7

out = S.vzip(x, y, "even")
out(i32x8): x0 y0 x2 y2 x4 y4 x6 y6

out = S.vzip(x, y, "odd")
out(i32x8): x1 y1 x3 y3 x5 y5 x7 y7
Parameters
- x, y: Union[PrimExpr, int, float, bool]
The operands.
- part: Optional[str]
Specifies which elements are selected.
all: Selects all elements.
low, high, even, odd: Select the corresponding half of the elements.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
vc = S.vzip(va, vb)
vc = S.vzip(va, vb, "low")
vc = S.vzip(va, 3)
vc = S.vzip(va, 3, "high")
vc = S.vzip(va, True, "even")
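The interleaving shown in the diagram can be modeled on plain lists as below. This is a sketch under stated assumptions rather than the DSL implementation: vzip_model is a made-up name, both inputs are assumed to be lists of the same even length, and scalar broadcasting is left out.

```python
def vzip_model(x, y, part="all"):
    """Plain-list model of vzip: select a part of x and y, then interleave."""
    half = len(x) // 2
    pick = {
        "all": lambda v: v[:],
        "low": lambda v: v[:half],
        "high": lambda v: v[half:],
        "even": lambda v: v[0::2],
        "odd": lambda v: v[1::2],
    }[part]
    out = []
    # Alternate one element of x with the matching element of y.
    for a, b in zip(pick(x), pick(y)):
        out.extend((a, b))
    return out


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vzip_model(x, y, "odd"))
```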
See Also
Zhouyi Compass OpenCL Programming Guide: __vzipl, __vziph, __vzipe, __vzipo
- tvm.aipu.script.ir.permutation.vcompt(x, mask)
Reads the active elements from x and packs them into the lowest-numbered elements of the result vector. The remaining upper elements of the result vector are set to zero.
x: x0 x1 x2 x3 x4 x5 x6 x7
mask: T T F F T T T F
out = S.vcompt(x, mask)
out: x0 x1 x4 x5 x6 0 0 0
Parameters
- x: PrimExpr
The operand.
- mask: Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]
The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vcompt(vx, mask="3T5F")
vc = S.vcompt(vx, mask=S.tail_mask(n, 8))
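To make the packing behavior concrete, here is a small plain-list model. It is only a sketch: vcompt_model is a hypothetical name, and the mask is assumed to be a list of booleans with one entry per lane.

```python
def vcompt_model(x, mask):
    """Pack the active elements of x to the front and zero-fill the rest."""
    active = [value for value, is_active in zip(x, mask) if is_active]
    # The remaining upper elements are set to zero.
    return active + [0] * (len(x) - len(active))


x = [10, 11, 12, 13, 14, 15, 16, 17]
mask = [True, True, False, False, True, True, True, False]
print(vcompt_model(x, mask))
```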
See Also
Zhouyi Compass OpenCL Programming Guide: __vcompt
- tvm.aipu.script.ir.permutation.vcompc(x, y, mask)
Reads the active elements from x and packs them into the lowest-numbered elements of the result vector. The remaining upper elements of the result vector are set to the lowest-numbered elements of y.
x: x0 x1 x2 x3 x4 x5 x6 x7
y: y0 y1 y2 y3 y4 y5 y6 y7
mask: T T F T F T F F
out = S.vcompc(x, y, mask)
out: x0 x1 x3 x5 y0 y1 y2 y3
Parameters
- x, y: Union[PrimExpr, int, float]
The inputs to be packed.
- mask: Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]
The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vcompc(va, vb, mask="3T5F")
vc = S.vcompc(va, vb, mask=S.tail_mask(n, 8))
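The difference from vcompt is only in how the upper elements are filled, which the following plain-list sketch tries to show. The name vcompc_model is made up for illustration, both inputs are assumed to be lists of equal length, and the mask is assumed to be a list of booleans.

```python
def vcompc_model(x, y, mask):
    """Pack the active elements of x, then fill the rest from the low end of y."""
    active = [value for value, is_active in zip(x, mask) if is_active]
    remaining = len(x) - len(active)
    # The upper elements come from the lowest-numbered elements of y.
    return active + list(y[:remaining])


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
mask = [True, True, False, True, False, True, False, False]
print(vcompc_model(x, y, mask))
```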
See Also
Zhouyi Compass OpenCL Programming Guide: __vcompc
- tvm.aipu.script.ir.permutation.vrevs(x)
Reverses the order of all elements in x. The case where x is a mask is also supported.
x: x0 x1 x2 x3 x4 x5 x6 x7
out = S.vrevs(x)
out: x7 x6 x5 x4 x3 x2 x1 x0
Parameters
- x: PrimExpr
The operand.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
vc = S.vrevs(vx)
mask_revs = S.vrevs(vx > 0)
See Also
Zhouyi Compass OpenCL Programming Guide: __vrevs, __vprevs
- tvm.aipu.script.ir.permutation.vsel(x, y, mask=None)
Selects the active elements from x and the inactive elements from y. The case where both x and y are masks is also supported. The feature Flexible Width Vector is supported.
The feature Multiple Width Vector is supported.
x: x0 x1 x2 x3 x4 x5 x6 x7
y: y0 y1 y2 y3 y4 y5 y6 y7
mask: T T F F T T T F
out = S.vsel(x, y, mask)
out: x0 x1 y2 y3 x4 x5 x6 y7
Parameters
- x, y: Union[PrimExpr, int, float]
The operands. If either one is a scalar, it will be automatically broadcast.
- mask: Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]
The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
vc = S.vsel(va, vb)
vc = S.vsel(va, 3)
vc = S.vsel(3, vb)
vc = S.vsel(va, vb, mask="3T5F")
vc = S.vsel(va, vb, mask=S.tail_mask(n, 8))
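A plain-list model of the selection and scalar broadcast may help when reading the calls above. It is a sketch, not the DSL implementation: vsel_model is a hypothetical name, vectors are represented as Python lists, anything that is not a list is treated as a scalar, and at least one operand is assumed to be a list.

```python
def vsel_model(x, y, mask=None):
    """Take x[i] where mask[i] is True, otherwise y[i]."""
    # Assumption: the lane count comes from whichever operand is a list.
    lanes = len(x) if isinstance(x, list) else len(y)
    xs = x if isinstance(x, list) else [x] * lanes  # broadcast a scalar x
    ys = y if isinstance(y, list) else [y] * lanes  # broadcast a scalar y
    if mask is None:
        mask = [True] * lanes  # None means all elements are active
    return [a if is_active else b for a, b, is_active in zip(xs, ys, mask)]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
mask = [True, True, False, False, True, True, True, False]
print(vsel_model(x, y, mask))
print(vsel_model(x, 3, mask))
```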
See Also
Zhouyi Compass OpenCL Programming Guide: __vsel
- tvm.aipu.script.ir.permutation.vshfl(x, shift)
Performs an element-wise rotate shift on vector x, from the high to the low direction, by the number of elements given by shift.
# shift direction: <----
x: x0 x1 x2 x3 x4 x5 x6 x7
shift: 2
out = S.vshfl(x, shift)
out: x2 x3 x4 x5 x6 x7 x0 x1
Parameters
- x: PrimExpr
The operand, should be a vector.
- shift: Union[PrimExpr, int]
The shift value, should be a scalar.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vshfl(va, scalar_var)
vc = S.vshfl(va, 3)
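The rotation in the diagram can be modeled with list slicing, as sketched below. The name vshfl_model is made up for illustration. The source does not say what happens when shift is negative or not smaller than the lane count, so this sketch simply rejects such values instead of guessing.

```python
def vshfl_model(x, shift):
    """Rotate the elements of x toward the low end by shift positions."""
    if not 0 <= shift < len(x):
        # Behavior outside this range is not described in the source.
        raise ValueError("this sketch only covers 0 <= shift < lanes")
    # Elements shifted out at the low end re-enter at the high end.
    return list(x[shift:]) + list(x[:shift])


x = [f"x{i}" for i in range(8)]
print(vshfl_model(x, 2))
```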
See Also
Zhouyi Compass OpenCL Programming Guide: __vshfl
- tvm.aipu.script.ir.permutation.vbcast(x, mask=None, lanes=None)
Performs a broadcast operation on a scalar with dtype to generate a vector.
The inactive elements of result vector are undefined.
The feature Flexible Width Vector is supported.
The feature Multiple Width Vector is supported.
x(i32): 3
mask: T T T T F F F F
out = S.vbcast(S.i32(3), mask)
out(i32x8): 3 3 3 3 ? ? ? ?
Parameters
- x: PrimExpr
The operand.
- mask: Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]
The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.
- lanes: Optional[int]
The number of lanes of the result vector dtype. If omitted, it is determined automatically from the type of the input value.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vbcast(S.i16(5))
vc = S.vbcast(S.fp16(1.23))
vc = S.vbcast(S.fp32(3.14159), mask="3T5F")
vc = S.vbcast(S.u32(1), mask="T7F")
vc = S.vbcast(S.u8(200), mask=S.tail_mask(n, 8))
vc = S.vbcast(S.i16(5), lanes=16)
vc = S.vbcast(x > 0, lanes=16)
See Also
Zhouyi Compass OpenCL Programming Guide: __vbcast
- tvm.aipu.script.ir.permutation.vsldl(x, y, shift)
Shifts x left by the unsigned immediate value shift, and pads the vacated space with elements from y.
# shift direction: <----
x: x0 x1 x2 x3 x4 x5 x6 x7
y: y0 y1 y2 y3 y4 y5 y6 y7
shift: 3
out = S.vsldl(x, y, shift)
out: x3 x4 x5 x6 x7 y0 y1 y2
Parameters
- x, y: Union[PrimExpr, int, float]
The operands. If either one is a scalar, it will be automatically broadcast. x and y should be of the same type.
- shift: int
The unsigned immediate value for the shift left operation.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vsldl(va, vb, shift=2)
vc = S.vsldl(va, 3, shift=2)
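One way to picture the operation is as a window sliding over x followed by y. The sketch below models that on plain lists; vsldl_model is a hypothetical name, both inputs are assumed to be lists of equal length, and shift is assumed to lie between 0 and the lane count.

```python
def vsldl_model(x, y, shift):
    """Drop the lowest shift elements of x and refill from the low end of y."""
    lanes = len(x)
    if not 0 <= shift <= lanes:
        # Larger shift values are not described in the source.
        raise ValueError("this sketch only covers 0 <= shift <= lanes")
    # Take a window of lanes elements from the concatenation of x and y.
    return (list(x) + list(y))[shift:shift + lanes]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vsldl_model(x, y, 3))
```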
See Also
Zhouyi Compass OpenCL Programming Guide: __vsldl
- tvm.aipu.script.ir.permutation.vtbl(table, indices)
Constructs table from 2 to 4 vectors, (a, b), (a, b, c) or (a, b, c, d), reads each element of the vector indices as an index to select an element from table, and places the indexed element in the corresponding element of the result vector. If an index value is greater than or equal to the element count of table, 0 is placed in the corresponding element of the result vector.
a: t0 t1 t2 t3 t4 t5 t6 t7
b: t8 t9 t10 t11 t12 t13 t14 t15
c: t16 t17 t18 t19 t20 t21 t22 t23
d: t24 t25 t26 t27 t28 t29 t30 t31
indices: 1 1 5 7 3 10 99 100
out = S.vtbl((a, b, c, d), indices)
out: t1 t1 t5 t7 t3 t10 0 0
Parameters
- table: Union[List[PrimExpr], Tuple[PrimExpr]]
A list of vectors that form the table.
- indices: PrimExpr
The indices into the table.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vout = S.vtbl((va, vb, vc, vd), index_vector)
vout = S.vtbl((va, vb, vc), index_vector)
vout = S.vtbl((va, vb), index_vector)
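The lookup rule, including the zero result for out-of-range indices, can be sketched on plain lists as follows. The name vtbl_model is made up for illustration. Negative indices are not described in the source; this sketch treats them as out of range, which is an assumption.

```python
def vtbl_model(table, indices):
    """Look up each index in the flattened table; out-of-range yields 0."""
    if not 2 <= len(table) <= 4:
        raise ValueError("table must hold 2 to 4 vectors")
    # The vectors are laid out back to back: a, then b, then c, then d.
    flat = [element for vector in table for element in vector]
    return [flat[i] if 0 <= i < len(flat) else 0 for i in indices]


# Four 8-lane vectors holding the values 100..131, so that a looked-up
# element can be told apart from the zero fill value.
a, b, c, d = (list(range(100 + 8 * k, 108 + 8 * k)) for k in range(4))
indices = [1, 1, 5, 7, 3, 10, 99, 100]
print(vtbl_model((a, b, c, d), indices))
```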
See Also
Zhouyi Compass OpenCL Programming Guide: __vtbl, __vperm
- tvm.aipu.script.ir.permutation.vreplic(x, index=0)
Uses the scalar index to choose one element from x, then fills all elements of the result vector with this element.
x: x0 x1 x2 x3 x4 x5 x6 x7
index: 3
out = S.vreplic(x, index)
out: x3 x3 x3 x3 x3 x3 x3 x3
Parameters
- x: PrimExpr
The operand.
- index: Optional[Union[PrimExpr, int]]
The index should be a scalar.
Returns
- ret: PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
vc = S.vreplic(va, scalar_var)
vc = S.vreplic(va, 3)
vc = S.vreplic(va)
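For completeness, the replication can be modeled on a plain list in one line. The name vreplic_model is hypothetical, and the sketch assumes index is a valid position within x, since the source does not describe out-of-range indices.

```python
def vreplic_model(x, index=0):
    """Fill every lane of the result with the element x[index]."""
    if not 0 <= index < len(x):
        # Out-of-range indices are not described in the source.
        raise ValueError("this sketch only covers 0 <= index < lanes")
    return [x[index]] * len(x)


x = [f"x{i}" for i in range(8)]
print(vreplic_model(x, 3))
```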
See Also
Zhouyi Compass OpenCL Programming Guide: __vreplic