Permutation

The permutation part of IR APIs.

tvm.aipu.script.ir.permutation.vconcat(x, y, part='all')

Concatenates x and y, selecting the elements to keep according to the value of the parameter part.

  • The feature Multiple Width Vector is supported.

   x(i32x8): x0  x1  x2  x3  x4  x5  x6  x7
   y(i32x8): y0  y1  y2  y3  y4  y5  y6  y7


out = S.vconcat(x, y)
out(i32x16): x0  x1  x2  x3  x4  x5  x6  x7  y0  y1  y2  y3  y4  y5  y6  y7

 out = S.vconcat(x, y, "low")
 out(i32x8): x0  x1  x2  x3  y0  y1  y2  y3

 out = S.vconcat(x, y, "high")
 out(i32x8): x4  x5  x6  x7  y4  y5  y6  y7

 out = S.vconcat(x, y, "even")
 out(i32x8): x0  x2  x4  x6  y0  y2  y4  y6

 out = S.vconcat(x, y, "odd")
 out(i32x8): x1  x3  x5  x7  y1  y3  y5  y7

Parameters

x, y : Union[PrimExpr, int, float]

The operands.

part : str

Specifies which elements are selected.

  • all: Selects all elements.

  • low, high, even, odd: Selects the corresponding half of the elements.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vconcat(va, vb, "low")
vc = S.vconcat(va, 3, "high")
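
The selection rule for part can be modelled in plain Python. The sketch below is only a list-based reference model of the diagram above, not the Compass DSL implementation; vconcat_ref is a hypothetical helper name, and the model assumes x and y have the same even length.

```python
def vconcat_ref(x, y, part="all"):
    """List-based model of the element selection shown for S.vconcat."""
    half = len(x) // 2
    # Each entry picks the part of one operand named by the part argument.
    pick = {
        "all": lambda v: v[:],
        "low": lambda v: v[:half],
        "high": lambda v: v[half:],
        "even": lambda v: v[0::2],
        "odd": lambda v: v[1::2],
    }[part]
    return pick(x) + pick(y)


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vconcat_ref(x, y, "even"))
```

With part set to all the result has twice as many elements as each input; with every other value it has the same number of elements as one input.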

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vextl, __vexth, __vexte, __vexto.

tvm.aipu.script.ir.permutation.vsplit(x)

Splits x evenly into multiple parts, each of which has a hardware-native vector type.

  • The feature Multiple Width Vector is supported.

  x(i32x16): x0  x1   x2   x3   x4   x5   x6   x7  x8  x9  x10  x11  x12  x13  x14  x15

out0, out1 = S.vsplit(x)
out0(i32x8): x0  x1   x2   x3   x4   x5   x6   x7
out1(i32x8): x8  x9  x10  x11  x12  x13  x14  x15

Parameters

x : PrimExpr

The operand.

Returns

ret : Tuple[PrimExpr]

The result expressions.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc0, vc1 = S.vsplit(vx_fp32x16)
vc0, vc1, vc2 = S.vsplit(vx_i32x24)
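
As a rough illustration of the even split, the sketch below models a vector as a Python list. It assumes a native vector holds 8 lanes of 32-bit data, as the i32x16 diagram above suggests; vsplit_ref and native_lanes are hypothetical names, not part of the DSL.

```python
def vsplit_ref(x, native_lanes=8):
    """List-based model of S.vsplit: cut x into native-width chunks."""
    if len(x) % native_lanes != 0:
        raise ValueError("length must be a multiple of the native lane count")
    return tuple(
        x[start:start + native_lanes]
        for start in range(0, len(x), native_lanes)
    )


out0, out1 = vsplit_ref(list(range(16)))
print(out0)
print(out1)
```

Under the same assumption, a 24-element input yields three parts, which matches the second example above.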

See Also

tvm.aipu.script.ir.permutation.vzip(x, y, part='all')

Selects elements from x and y and returns them interleaved in alternating order. Which elements are selected depends on the value of the parameter part.

  • The case where both x and y are masks is also supported.

  • The feature Flexible Width Vector is supported only when part is all.

  • The feature Multiple Width Vector is supported.

   x(i32x8): x0  x1  x2  x3  x4  x5  x6  x7
   y(i32x8): y0  y1  y2  y3  y4  y5  y6  y7

out = S.vzip(x, y)
out(i32x16): x0  y0  x1  y1  x2  y2  x3  y3  x4  y4  x5  y5  x6  y6  x7  y7

 out = S.vzip(x, y, "low")
 out(i32x8): x0  y0  x1  y1  x2  y2  x3  y3

 out = S.vzip(x, y, "high")
 out(i32x8): x4  y4  x5  y5  x6  y6  x7  y7

 out = S.vzip(x, y, "even")
 out(i32x8): x0  y0  x2  y2  x4  y4  x6  y6

 out = S.vzip(x, y, "odd")
 out(i32x8): x1  y1  x3  y3  x5  y5  x7  y7

Parameters

x, y : Union[PrimExpr, int, float, bool]

The operands.

part : Optional[str]

Specifies which elements are selected.

  • all: Selects all elements.

  • low, high, even, odd: Selects the corresponding half of the elements.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.

Examples

vc = S.vzip(va, vb)
vc = S.vzip(va, vb, "low")
vc = S.vzip(va, 3)
vc = S.vzip(va, 3, "high")
vc = S.vzip(va, True, "even")
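
The interleaving can be checked against a small list-based model. This is a sketch of the diagram above rather than the DSL implementation; vzip_ref is a hypothetical name, and x and y are assumed to have the same even length.

```python
def vzip_ref(x, y, part="all"):
    """List-based model of S.vzip: select a part, then interleave x and y."""
    half = len(x) // 2
    pick = {
        "all": lambda v: v[:],
        "low": lambda v: v[:half],
        "high": lambda v: v[half:],
        "even": lambda v: v[0::2],
        "odd": lambda v: v[1::2],
    }[part]
    out = []
    for a, b in zip(pick(x), pick(y)):
        out += [a, b]  # one element from x, then the matching one from y
    return out


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vzip_ref(x, y, "high"))
```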

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vzipl, __vziph, __vzipe, __vzipo

tvm.aipu.script.ir.permutation.vcompt(x, mask)

Reads the active elements from x and packs them into the lowest-numbered elements of the result vector.

  • The remaining upper elements of the result vector are set to zero.

   x: x0  x1  x2  x3  x4  x5  x6  x7
mask:  T   T   F   F   T   T   T   F

 out = S.vcompt(x, mask)
 out: x0  x1  x4  x5  x6   0   0   0

Parameters

x : PrimExpr

The operand.

mask : Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]

The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vcompt(vx, mask="3T5F")
vc = S.vcompt(vx, mask=S.tail_mask(n, 8))
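
The packing behaviour can be sketched with Python lists. Both helpers below are hypothetical and only model the description above. In particular, expand_mask assumes that a mask string such as "3T5F" is a run-length encoding (three active lanes followed by five inactive ones), which is inferred from the examples rather than confirmed here.

```python
import re


def expand_mask(spec):
    """Expand a run-length mask string such as "3T5F" into a list of bools.

    Assumed format: an optional repeat count followed by T or F.
    """
    out = []
    for count, flag in re.findall(r"(\d*)([TF])", spec):
        out += [flag == "T"] * int(count or 1)
    return out


def vcompt_ref(x, mask):
    """List-based model of S.vcompt: pack active lanes low, zero-fill the rest."""
    active = [v for v, m in zip(x, mask) if m]
    return active + [0] * (len(x) - len(active))


x = [10, 11, 12, 13, 14, 15, 16, 17]
print(vcompt_ref(x, expand_mask("3T5F")))  # [10, 11, 12, 0, 0, 0, 0, 0]
```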

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vcompt

tvm.aipu.script.ir.permutation.vcompc(x, y, mask)

Reads the active elements from x and packs them into the lowest-numbered elements of the result vector.

  • The remaining upper elements of the result vector are filled with the lowest-numbered elements of y.

   x: x0  x1  x2  x3  x4  x5  x6  x7
   y: y0  y1  y2  y3  y4  y5  y6  y7
mask:  T   T   F   T   F   T   F   F

 out = S.vcompc(x, y, mask)
 out: x0  x1  x3  x5  y0  y1  y2  y3

Parameters

x, y : Union[PrimExpr, int, float]

The inputs to be packed.

mask : Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]

The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vcompc(va, vb, mask="3T5F")
vc = S.vcompc(va, vb, mask=S.tail_mask(n, 8))
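
A list-based model makes the fill rule concrete: the active elements of x come first, and the remaining slots take elements of y starting from y0. The helper vcompc_ref is hypothetical and assumes x, y and mask are lists of equal length.

```python
def vcompc_ref(x, y, mask):
    """List-based model of S.vcompc: pack active lanes of x, then fill from y."""
    active = [v for v, m in zip(x, mask) if m]
    # The slots left over are taken from the low end of y.
    return active + y[: len(x) - len(active)]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
mask = [True, True, False, True, False, True, False, False]
print(vcompc_ref(x, y, mask))
```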

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vcompc

tvm.aipu.script.ir.permutation.vrevs(x)

Reverses the order of all elements in x.

  • The case where x is a mask is also supported.

  x: x0  x1  x2  x3  x4  x5  x6  x7

out = S.vrevs(x)
out: x7  x6  x5  x4  x3  x2  x1  x0

Parameters

x : PrimExpr

The operand.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.

Examples

vc = S.vrevs(vx)
mask_revs = S.vrevs(vx > 0)

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vrevs, __vprevs

tvm.aipu.script.ir.permutation.vsel(x, y, mask=None)

Selects the active elements from x and the inactive elements from y.

  • The case where both x and y are masks is also supported.

  • The feature Flexible Width Vector is supported.

  • The feature Multiple Width Vector is supported.

   x: x0  x1  x2  x3  x4  x5  x6  x7
   y: y0  y1  y2  y3  y4  y5  y6  y7
mask:  T   T   F   F   T   T   T   F

 out = S.vsel(x, y, mask)
 out: x0  x1  y2  y3  x4  x5  x6  y7

Parameters

x, y : Union[PrimExpr, int, float]

The operands. If either one is a scalar, it will be automatically broadcast.

mask : Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]

The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”, “bool8/16/32”.

Examples

vc = S.vsel(va, vb)
vc = S.vsel(va, 3)
vc = S.vsel(3, vb)
vc = S.vsel(va, vb, mask="3T5F")
vc = S.vsel(va, vb, mask=S.tail_mask(n, 8))
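
The per-lane selection and the scalar broadcast can be sketched with Python lists. The helper vsel_ref is hypothetical and assumes at least one of x and y is a list; a scalar operand is repeated to the lane count of the other.

```python
def vsel_ref(x, y, mask=None):
    """List-based model of S.vsel: x where the mask is active, y elsewhere."""
    lanes = len(x) if isinstance(x, list) else len(y)
    # Broadcast a scalar operand to the lane count of the vector operand.
    xs = x if isinstance(x, list) else [x] * lanes
    ys = y if isinstance(y, list) else [y] * lanes
    if mask is None:
        mask = [True] * lanes  # None means every lane is active
    return [a if m else b for a, b, m in zip(xs, ys, mask)]


va = [0, 1, 2, 3, 4, 5, 6, 7]
mask = [True, True, False, False, True, True, True, False]
print(vsel_ref(va, 3, mask))  # [0, 1, 3, 3, 4, 5, 6, 3]
```

Note that with mask left as None every lane is active, so the model simply returns the elements of x.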

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vsel

tvm.aipu.script.ir.permutation.vshfl(x, shift)

Rotates the elements of vector x from the high direction toward the low direction by the number of positions given by shift.

    # shift direction:   <----
    x: x0  x1  x2  x3  x4  x5  x6  x7
shift: 2

  out = S.vshfl(x, shift)
  out: x2  x3  x4  x5  x6  x7  x0  x1

Parameters

x : PrimExpr

The operand, should be a vector.

shift : Union[PrimExpr, int]

The shift value, should be a scalar.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vshfl(va, scalar_var)
vc = S.vshfl(va, 3)
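
In list terms the rotation is a slice swap. The sketch below is a hypothetical model, vshfl_ref, that covers only shift values in the range 0 to len(x) - 1; the behaviour for larger values is not described here, so the model rejects them.

```python
def vshfl_ref(x, shift):
    """List-based model of S.vshfl: rotate elements toward the low end."""
    if not 0 <= shift < len(x):
        raise ValueError("this model only covers 0 <= shift < len(x)")
    # Elements shifted out at the low end re-enter at the high end.
    return x[shift:] + x[:shift]


x = [f"x{i}" for i in range(8)]
print(vshfl_ref(x, 2))
```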

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vshfl

tvm.aipu.script.ir.permutation.vbcast(x, mask=None, lanes=None)

Broadcasts a scalar that carries an explicit dtype to generate a vector.

  • The inactive elements of the result vector are undefined.

  • The feature Flexible Width Vector is supported.

  • The feature Multiple Width Vector is supported.

    x(i32): 3
      mask: T  T  T  T  F  F  F  F

out = S.vbcast(S.i32(3), mask)
out(i32x8): 3  3  3  3  ?  ?  ?  ?

Parameters

x : PrimExpr

The operand.

mask : Optional[Union[Tuple[bool], List[bool], numpy.ndarray[bool], str, PrimExpr]]

The predication mask to indicate which elements of the vector are active for the operation. None means all elements are active.

lanes : Optional[int]

The number of lanes of the result vector dtype. If omitted, it is determined automatically from the type of the input value.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vbcast(S.i16(5))
vc = S.vbcast(S.fp16(1.23))
vc = S.vbcast(S.fp32(3.14159), mask="3T5F")
vc = S.vbcast(S.u32(1), mask="T7F")
vc = S.vbcast(S.u8(200), mask=S.tail_mask(n, 8))
vc = S.vbcast(S.i16(5), lanes=16)
vc = S.vbcast(x > 0, lanes=16)
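
The effect of the mask can be sketched with Python lists. In this hypothetical model, vbcast_ref, inactive lanes are marked with None purely to show that their value must not be relied on; the default of 8 lanes is an assumption that matches the i32x8 diagram above.

```python
UNDEFINED = None  # placeholder for lanes whose value is undefined


def vbcast_ref(value, mask=None, lanes=8):
    """List-based model of S.vbcast: copy value into every active lane."""
    if mask is None:
        mask = [True] * lanes  # None means every lane is active
    return [value if m else UNDEFINED for m in mask]


mask = [True, True, True, True, False, False, False, False]
print(vbcast_ref(3, mask))  # [3, 3, 3, 3, None, None, None, None]
```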

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vbcast

tvm.aipu.script.ir.permutation.vsldl(x, y, shift)

Shifts x left by the unsigned immediate value shift and fills the vacated positions with the lowest-numbered elements of y.

    # shift direction:   <----
    x: x0  x1  x2  x3  x4  x5  x6  x7
    y: y0  y1  y2  y3  y4  y5  y6  y7
shift: 3

  out = S.vsldl(x, y, shift)
  out: x3  x4  x5  x6  x7  y0  y1  y2

Parameters

x, y : Union[PrimExpr, int, float]

The operands. If either one is a scalar, it will be automatically broadcast. The x and y should be of the same type.

shift : int

The unsigned immediate value for the shift left operation.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vsldl(va, vb, shift=2)
vc = S.vsldl(va, 3, shift=2)
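
Viewed as lists, the operation takes a window of len(x) elements from the concatenation of x and y, starting at position shift. The helper vsldl_ref below is a hypothetical model of that reading; it assumes list operands of equal length and accepts only 0 <= shift <= len(x).

```python
def vsldl_ref(x, y, shift):
    """List-based model of S.vsldl: drop shift low elements of x, append from y."""
    if not 0 <= shift <= len(x):
        raise ValueError("this model only covers 0 <= shift <= len(x)")
    return x[shift:] + y[:shift]


x = [f"x{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
print(vsldl_ref(x, y, 3))
```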

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vsldl

tvm.aipu.script.ir.permutation.vtbl(table, indices)

Constructs a table from 2 to 4 vectors, (a, b), (a, b, c) or (a, b, c, d), reads each element of the vector indices as an index into the table, and places the selected element in the corresponding element of the result vector. If an index value is greater than or equal to the element count of the table, 0 is placed in the corresponding element of the result vector.

      a: t0  t1  t2  t3  t4  t5  t6  t7
      b: t8  t9  t10 t11 t12 t13 t14 t15
      c: t16 t17 t18 t19 t20 t21 t22 t23
      d: t24 t25 t26 t27 t28 t29 t30 t31
indices: 1   1   5   7   3   10  99  100

    out = S.vtbl((a, b, c, d), indices)
    out: t1  t1  t5  t7  t3  t10 0   0

Parameters

table : Union[List[PrimExpr], Tuple[PrimExpr]]

A list or tuple of vectors that together form the table.

indices : PrimExpr

The indices into the table.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vout = S.vtbl((va, vb, vc, vd), index_vector)
vout = S.vtbl((va, vb, vc), index_vector)
vout = S.vtbl((va, vb), index_vector)
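
The lookup rule, including the out-of-range case, can be modelled by flattening the table vectors into one list. The helper vtbl_ref is hypothetical and assumes all indices are non-negative.

```python
def vtbl_ref(table, indices):
    """List-based model of S.vtbl: look up each index in the flattened table."""
    # Vectors a, b, c, d are laid out back to back, so b starts at index len(a).
    flat = [elem for vec in table for elem in vec]
    # Indices are assumed non-negative; out-of-range indices yield 0.
    return [flat[i] if i < len(flat) else 0 for i in indices]


a, b, c, d = ([f"t{i}" for i in range(s, s + 8)] for s in (0, 8, 16, 24))
indices = [1, 1, 5, 7, 3, 10, 99, 100]
print(vtbl_ref((a, b, c, d), indices))
```

With only two table vectors the element count drops to 16 in this model, so an index such as 20 would also produce 0.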

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vtbl, __vperm

tvm.aipu.script.ir.permutation.vreplic(x, index=0)

Uses the scalar index to choose one element from x, then fills every element of the result vector with that element.

    x: x0  x1  x2  x3  x4  x5  x6  x7
index: 3

  out = S.vreplic(x, index)
  out: x3  x3  x3  x3  x3  x3  x3  x3

Parameters

x : PrimExpr

The operand.

index : Optional[Union[PrimExpr, int]]

The index should be a scalar.

Returns

ret : PrimExpr

The result expression.

Supported DType

“int8/16/32”, “uint8/16/32”, “float16/32”.

Examples

vc = S.vreplic(va, scalar_var)
vc = S.vreplic(va, 3)
vc = S.vreplic(va)
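
As a quick sanity check of the semantics, the replication can be modelled with a Python list. The helper vreplic_ref is hypothetical and assumes index is a valid position in x.

```python
def vreplic_ref(x, index=0):
    """List-based model of S.vreplic: fill every lane with x[index]."""
    return [x[index]] * len(x)


x = [f"x{i}" for i in range(8)]
print(vreplic_ref(x, 3))
```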

See Also

  • Zhouyi Compass OpenCL Programming Guide: __vreplic