Permutation
The permutation part of IR APIs.
- tvm.aipu.script.ir.permutation.vconcat(x, y, part='all')
Concatenates x and y, selecting elements according to the value of the parameter part.
The feature Multiple Width Vector is supported.
    x(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
    y(i32x8): y0 y1 y2 y3 y4 y5 y6 y7

    out = S.vconcat(x, y)
    out(i32x16): x0 x1 x2 x3 x4 x5 x6 x7 y0 y1 y2 y3 y4 y5 y6 y7

    out = S.vconcat(x, y, "low")
    out(i32x8): x0 x1 x2 x3 y0 y1 y2 y3

    out = S.vconcat(x, y, "high")
    out(i32x8): x4 x5 x6 x7 y4 y5 y6 y7

    out = S.vconcat(x, y, "even")
    out(i32x8): x0 x2 x4 x6 y0 y2 y4 y6

    out = S.vconcat(x, y, "odd")
    out(i32x8): x1 x3 x5 x7 y1 y3 y5 y7
Parameters
- x, y : Union[PrimExpr, int, float]
The operands.
- part : str
Specifies which elements are selected.
all: Selects all elements.
low, high, even, odd: Selects the corresponding half of the elements.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vconcat(va, vb, "low")
    vc = S.vconcat(va, 3, "high")
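The element order in the diagram above can be reproduced with a small pure-Python model. This is only a sketch for checking expectations on the host, not the library's implementation: `vconcat_ref` is a hypothetical helper, and vectors are modeled as plain lists.

```python
def vconcat_ref(x, y, part="all"):
    """Host-side model of the vconcat element order (hypothetical helper)."""
    half = len(x) // 2
    pick = {
        "all": lambda v: list(v),
        "low": lambda v: list(v[:half]),
        "high": lambda v: list(v[half:]),
        "even": lambda v: list(v[0::2]),
        "odd": lambda v: list(v[1::2]),
    }[part]
    # The selected part of x comes first, followed by the same part of y.
    return pick(x) + pick(y)


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vconcat_ref(x, y, "even"))
```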
See Also
Zhouyi Compass OpenCL Programming Guide: __vextl, __vexth, __vexte, __vexto.
- tvm.aipu.script.ir.permutation.vsplit(x)
Splits x evenly into multiple parts according to the hardware native vector types.
The feature Multiple Width Vector is supported.
    x(i32x16): x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15

    out0, out1 = S.vsplit(x)
    out0(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
    out1(i32x8): x8 x9 x10 x11 x12 x13 x14 x15
Parameters
- x : PrimExpr
The operand.
Returns
- ret : Tuple[PrimExpr]
The result expressions.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc0, vc1 = S.vsplit(vx_fp32x16)
    vc0, vc1, vc2 = S.vsplit(vx_i32x24)
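A minimal host-side sketch of the splitting rule, assuming a native vector holds 8 lanes of a 32-bit type as in the i32x8 diagram above. `vsplit_ref` and its `native_lanes` parameter are hypothetical names introduced for illustration; the real API takes only `x`.

```python
def vsplit_ref(x, native_lanes=8):
    """Split a list into consecutive chunks of native_lanes elements."""
    if len(x) % native_lanes != 0:
        raise ValueError("length must be a multiple of native_lanes")
    return tuple(
        list(x[i:i + native_lanes]) for i in range(0, len(x), native_lanes)
    )


# A 24-lane input yields three 8-lane parts, mirroring the i32x24 example.
parts = vsplit_ref(list(range(24)))
print(len(parts), parts[2])
```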
See Also
- tvm.aipu.script.ir.permutation.vzip(x, y, part='all')
Selects some elements from x and y, rearranges them in an interleaved, alternating sequence, and returns the result. The elements are selected according to the value of the parameter part.
The case where both x and y are masks is also supported.
The feature Flexible Width Vector is supported only when part is all.
The feature Multiple Width Vector is supported.
    x(i32x8): x0 x1 x2 x3 x4 x5 x6 x7
    y(i32x8): y0 y1 y2 y3 y4 y5 y6 y7

    out = S.vzip(x, y)
    out(i32x16): x0 y0 x1 y1 x2 y2 x3 y3 x4 y4 x5 y5 x6 y6 x7 y7

    out = S.vzip(x, y, "low")
    out(i32x8): x0 y0 x1 y1 x2 y2 x3 y3

    out = S.vzip(x, y, "high")
    out(i32x8): x4 y4 x5 y5 x6 y6 x7 y7

    out = S.vzip(x, y, "even")
    out(i32x8): x0 y0 x2 y2 x4 y4 x6 y6

    out = S.vzip(x, y, "odd")
    out(i32x8): x1 y1 x3 y3 x5 y5 x7 y7
Parameters
- x, y : Union[PrimExpr, int, float, bool]
The operands.
- part : Optional[str]
Specifies which elements are selected.
all: Selects all elements.
low, high, even, odd: Selects the corresponding half of the elements.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
    vc = S.vzip(va, vb)
    vc = S.vzip(va, vb, "low")
    vc = S.vzip(va, 3)
    vc = S.vzip(va, 3, "high")
    vc = S.vzip(va, True, "even")
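The interleaving shown in the diagram above can be modeled on the host as follows. This is a sketch under the assumption that vectors are plain lists; `vzip_ref` is a hypothetical helper, not part of the API.

```python
def vzip_ref(x, y, part="all"):
    """Host-side model of the vzip element order (hypothetical helper)."""
    half = len(x) // 2
    pick = {
        "all": lambda v: list(v),
        "low": lambda v: list(v[:half]),
        "high": lambda v: list(v[half:]),
        "even": lambda v: list(v[0::2]),
        "odd": lambda v: list(v[1::2]),
    }[part]
    out = []
    # Alternate one selected element of x with the matching element of y.
    for a, b in zip(pick(x), pick(y)):
        out.extend([a, b])
    return out


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vzip_ref(x, y, "odd"))
```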
See Also
Zhouyi Compass OpenCL Programming Guide: __vzipl, __vziph, __vzipe, __vzipo
- tvm.aipu.script.ir.permutation.vcompt(x, mask)
Reads active elements from x and packs them into the lowest-numbered elements of the result vector.
The remaining upper elements of the result vector are set to zero.
    x: x0 x1 x2 x3 x4 x5 x6 x7
    mask: T T F F T T T F

    out = S.vcompt(x, mask)
    out: x0 x1 x4 x5 x6 0 0 0
Parameters
- x : PrimExpr
The operand.
- mask : Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]
The predication mask to indicate which elements of the vector are active for the operation.
None means all elements are active.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vcompt(vx, mask="3T5F")
    vc = S.vcompt(vx, mask=S.tail_mask(n, 8))
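The compaction described above can be sketched in pure Python. `vcompt_ref` is a hypothetical helper, and the mask is modeled as a list of booleans rather than any of the mask forms the API accepts.

```python
def vcompt_ref(x, mask):
    """Pack active elements to the low end and zero-fill the rest."""
    active = [v for v, m in zip(x, mask) if m]
    return active + [0] * (len(x) - len(active))


x = [10, 11, 12, 13, 14, 15, 16, 17]
mask = [True, True, False, False, True, True, True, False]
print(vcompt_ref(x, mask))
```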
See Also
Zhouyi Compass OpenCL Programming Guide: __vcompt
- tvm.aipu.script.ir.permutation.vcompc(x, y, mask)
Reads active elements from x and packs them into the lowest-numbered elements of the result vector.
The remaining upper elements of the result vector are set to the lowest-numbered elements of y.
    x: x0 x1 x2 x3 x4 x5 x6 x7
    y: y0 y1 y2 y3 y4 y5 y6 y7
    mask: T T F T F T F F

    out = S.vcompc(x, y, mask)
    out: x0 x1 x3 x5 y0 y1 y2 y3
Parameters
- x, y : Union[PrimExpr, int, float]
The inputs to be packed.
- mask : Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]
The predication mask to indicate which elements of the vector are active for the operation.
None means all elements are active.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vcompc(va, vb, mask="3T5F")
    vc = S.vcompc(va, vb, mask=S.tail_mask(n, 8))
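A host-side sketch of the same packing with the tail filled from y, following the diagram above. `vcompc_ref` is a hypothetical helper and both vectors are modeled as lists of equal length.

```python
def vcompc_ref(x, y, mask):
    """Pack active elements of x low, then fill the tail from the low end of y."""
    active = [v for v, m in zip(x, mask) if m]
    return active + list(y[: len(x) - len(active)])


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
mask = [True, True, False, True, False, True, False, False]
print(vcompc_ref(x, y, mask))
```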
See Also
Zhouyi Compass OpenCL Programming Guide: __vcompc
- tvm.aipu.script.ir.permutation.vrevs(x)
Reverses the order of all elements in x.
The case where x is a mask is also supported.
    x: x0 x1 x2 x3 x4 x5 x6 x7

    out = S.vrevs(x)
    out: x7 x6 x5 x4 x3 x2 x1 x0
Parameters
- x : PrimExpr
The operand.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
    vc = S.vrevs(vx)
    mask_revs = S.vrevs(vx > 0)
See Also
Zhouyi Compass OpenCL Programming Guide: __vrevs, __vprevs
- tvm.aipu.script.ir.permutation.vsel(x, y, mask=None)
Selects active elements from x and inactive elements from y.
The case where both x and y are masks is also supported.
The feature Flexible Width Vector is supported.
The feature Multiple Width Vector is supported.
    x: x0 x1 x2 x3 x4 x5 x6 x7
    y: y0 y1 y2 y3 y4 y5 y6 y7
    mask: T T F F T T T F

    out = S.vsel(x, y, mask)
    out: x0 x1 y2 y3 x4 x5 x6 y7
Parameters
- x, y : Union[PrimExpr, int, float]
The operands. If either one is a scalar, it will be automatically broadcast.
- mask : Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]
The predication mask to indicate which elements of the vector are active for the operation.
None means all elements are active.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.
Examples
    vc = S.vsel(va, vb)
    vc = S.vsel(va, 3)
    vc = S.vsel(3, vb)
    vc = S.vsel(va, vb, mask="3T5F")
    vc = S.vsel(va, vb, mask=S.tail_mask(n, 8))
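The per-lane selection, including scalar broadcast and the `mask=None` default, can be sketched on the host like this. `vsel_ref` is a hypothetical helper; it assumes at least one operand is a list and models the mask as a list of booleans.

```python
def vsel_ref(x, y, mask=None):
    """Take x where the mask is True and y elsewhere (hypothetical helper)."""
    is_vec = lambda v: isinstance(v, (list, tuple))
    lanes = len(x) if is_vec(x) else len(y)
    xs = list(x) if is_vec(x) else [x] * lanes  # broadcast a scalar x
    ys = list(y) if is_vec(y) else [y] * lanes  # broadcast a scalar y
    ms = [True] * lanes if mask is None else list(mask)
    return [a if m else b for a, b, m in zip(xs, ys, ms)]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
mask = [True, True, False, False, True, True, True, False]
print(vsel_ref(x, y, mask))
```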
See Also
Zhouyi Compass OpenCL Programming Guide: __vsel
- tvm.aipu.script.ir.permutation.vshfl(x, shift)
Performs an element-wise rotate shift on vector x in the high-to-low direction, by the number of elements given by shift.
    # shift direction: <----
    x: x0 x1 x2 x3 x4 x5 x6 x7
    shift: 2

    out = S.vshfl(x, shift)
    out: x2 x3 x4 x5 x6 x7 x0 x1
Parameters
- x : PrimExpr
The operand, should be a vector.
- shift : Union[PrimExpr, int]
The shift value, should be a scalar.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vshfl(va, scalar_var)
    vc = S.vshfl(va, 3)
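The rotation in the diagram above corresponds to the following host-side sketch. `vshfl_ref` is a hypothetical helper. The text does not say what happens for shift values outside 0 to lanes - 1, so this model simply rejects them; that choice is an assumption, not documented behavior.

```python
def vshfl_ref(x, shift):
    """Rotate elements toward the low end by shift positions."""
    if not 0 <= shift < len(x):
        # Assumption: out-of-range shifts are not covered by the description.
        raise ValueError("shift must be in [0, lanes)")
    return list(x[shift:]) + list(x[:shift])


x = [f"x{i}" for i in range(8)]
print(vshfl_ref(x, 2))
```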
See Also
Zhouyi Compass OpenCL Programming Guide: __vshfl
- tvm.aipu.script.ir.permutation.vbcast(x, mask=None, lanes=None)
Performs a broadcast operation on a scalar with dtype to generate a vector.
The inactive elements of result vector are undefined.
The feature Flexible Width Vector is supported.
The feature Multiple Width Vector is supported.
    x(i32): 3
    mask: T T T T F F F F

    out = S.vbcast(S.i32(3), mask)
    out(i32x8): 3 3 3 3 ? ? ? ?
Parameters
- x : PrimExpr
The operand.
- mask : Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]
The predication mask to indicate which elements of the vector are active for the operation.
None means all elements are active.
- lanes : Optional[int]
The number of lanes of the result vector dtype. If omitted, it is determined automatically from the type of the input value.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vbcast(S.i16(5))
    vc = S.vbcast(S.fp16(1.23))
    vc = S.vbcast(S.fp32(3.14159), mask="3T5F")
    vc = S.vbcast(S.u32(1), mask="T7F")
    vc = S.vbcast(S.u8(200), mask=S.tail_mask(n, 8))
    vc = S.vbcast(S.i16(5), lanes=16)
    vc = S.vbcast(x > 0, lanes=16)
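A rough host-side model of the masked broadcast, using `None` to stand in for the undefined inactive lanes. `vbcast_ref` is a hypothetical helper, and its default of 8 lanes is an assumption taken from the i32x8 diagram; the real API derives the lane count from the input dtype when `lanes` is omitted.

```python
def vbcast_ref(x, mask=None, lanes=8):
    """Broadcast scalar x to a list; inactive lanes are marked as None."""
    ms = [True] * lanes if mask is None else list(mask)
    # None is only a placeholder: the real inactive lanes hold undefined data.
    return [x if m else None for m in ms]


mask = [True] * 4 + [False] * 4
print(vbcast_ref(3, mask))
```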
See Also
Zhouyi Compass OpenCL Programming Guide: __vbcast
- tvm.aipu.script.ir.permutation.vsldl(x, y, shift)
Shifts x left by the unsigned immediate value shift, and pads the vacated space with the lowest-numbered elements of y.
    # shift direction: <----
    x: x0 x1 x2 x3 x4 x5 x6 x7
    y: y0 y1 y2 y3 y4 y5 y6 y7
    shift: 3

    out = S.vsldl(x, y, shift)
    out: x3 x4 x5 x6 x7 y0 y1 y2
Parameters
- x, y : Union[PrimExpr, int, float]
The operands. If either one is a scalar, it will be automatically broadcast. x and y should be of the same type.
- shift : int
The unsigned immediate value for the shift left operation.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vsldl(va, vb, shift=2)
    vc = S.vsldl(va, 3, shift=2)
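One way to picture the operation is as a window of width lanes sliding over the concatenation of x and y. The sketch below models that on the host; `vsldl_ref` is a hypothetical helper, and limiting shift to 0 through lanes is an assumption, since the text only says the value is an unsigned immediate.

```python
def vsldl_ref(x, y, shift):
    """Slide a lanes-wide window over x followed by y, starting at shift."""
    lanes = len(x)
    if not 0 <= shift <= lanes:
        # Assumption: larger shifts are not covered by the description.
        raise ValueError("shift must be in [0, lanes]")
    return (list(x) + list(y))[shift:shift + lanes]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vsldl_ref(x, y, 3))
```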
See Also
Zhouyi Compass OpenCL Programming Guide: __vsldl
- tvm.aipu.script.ir.permutation.vtbl(table, indices)
Constructs table from 2 to 4 vectors, (a, b), (a, b, c) or (a, b, c, d), reads each element of the vector indices as an index to select an element from table, and places the selected element in the corresponding element of the result vector. If an index value is >= the element count of table, 0 is placed in the corresponding element of the result vector.
    a: t0 t1 t2 t3 t4 t5 t6 t7
    b: t8 t9 t10 t11 t12 t13 t14 t15
    c: t16 t17 t18 t19 t20 t21 t22 t23
    d: t24 t25 t26 t27 t28 t29 t30 t31
    indices: 1 1 5 7 3 10 99 100

    out = S.vtbl((a, b, c, d), indices)
    out: t1 t1 t5 t7 t3 t10 0 0
Parameters
- table : Union[List[PrimExpr], Tuple[PrimExpr]]
A list of vectors that forms the table.
- indices : PrimExpr
The indices into the table.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vout = S.vtbl((va, vb, vc, vd), index_vector)
    vout = S.vtbl((va, vb, vc), index_vector)
    vout = S.vtbl((va, vb), index_vector)
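The lookup rule can be checked with a small host-side model. `vtbl_ref` is a hypothetical helper. Negative indices are not covered by the description, so mapping them to 0 here is an assumption.

```python
def vtbl_ref(table, indices):
    """Look up each index in the flattened table; out-of-range yields 0."""
    if not 2 <= len(table) <= 4:
        raise ValueError("table must consist of 2 to 4 vectors")
    flat = [v for vec in table for v in vec]
    # Assumption: negative indices are treated like out-of-range ones.
    return [flat[i] if 0 <= i < len(flat) else 0 for i in indices]


a, b, c, d = ([f"t{8 * k + i}" for i in range(8)] for k in range(4))
indices = [1, 1, 5, 7, 3, 10, 99, 100]
print(vtbl_ref((a, b, c, d), indices))
```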
See Also
Zhouyi Compass OpenCL Programming Guide: __vtbl, __vperm
- tvm.aipu.script.ir.permutation.vreplic(x, index=0)
Uses the scalar index to choose one element from x, then fills every element of the result vector with that element.
    x: x0 x1 x2 x3 x4 x5 x6 x7
    index: 3

    out = S.vreplic(x, index)
    out: x3 x3 x3 x3 x3 x3 x3 x3
Parameters
- x : PrimExpr
The operand.
- index : Optional[Union[PrimExpr, int]]
The index should be a scalar.
Returns
- ret : PrimExpr
The result expression.
Supported DType
“int8/16/32”, “uint8/16/32”, “float16/32”.
Examples
    vc = S.vreplic(va, scalar_var)
    vc = S.vreplic(va, 3)
    vc = S.vreplic(va)
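As a quick host-side check of the replication rule, the sketch below models vectors as lists. `vreplic_ref` is a hypothetical helper; rejecting an out-of-range index is an assumption, as the description does not cover that case.

```python
def vreplic_ref(x, index=0):
    """Fill every lane with the element of x selected by index."""
    if not 0 <= index < len(x):
        # Assumption: out-of-range indices are not covered by the description.
        raise ValueError("index must be in [0, lanes)")
    return [x[index]] * len(x)


x = [f"x{i}" for i in range(8)]
print(vreplic_ref(x, 3))
```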
See Also
Zhouyi Compass OpenCL Programming Guide: __vreplic