Compass DSL
1.2


Builtin API

  • Arithmetic
    • vadd()
    • vaddh()
    • vabs()
    • vsub()
    • vsubh()
    • vmul()
    • vmull()
    • vmulh()
    • vdiv()
    • vmod()
    • vdot()
    • vqdot()
    • vdpa()
    • vqdpa()
    • vrpadd()
    • vmml()
    • vmma()
    • fma()
    • vfmae()
    • vfmao()
    • vrint()
    • clip()
  • Bitwise
    • vand()
    • vor()
    • vinv()
    • vall()
    • vany()
    • vxor()
    • vnsr()
    • vnsrsr()
    • vsr()
    • vsl()
    • vror()
    • vcls()
    • clz()
    • vbrevs()
    • vpcnt()
  • Compare
    • vceq()
    • vcneq()
    • vcge()
    • vcgt()
    • vcle()
    • vclt()
    • isnan()
    • isinf()
    • max()
    • min()
    • vmaxh()
    • vminh()
    • vrpmax()
    • vrpmin()
  • Conversion
    • i16x32()
    • u16x32()
    • i32x16()
    • i32x32()
    • u32x16()
    • u32x32()
    • fp16x32()
    • int16x32()
    • uint16x32()
    • int32x16()
    • int32x32()
    • uint32x16()
    • uint32x32()
    • float16x32()
    • size_i16x32()
    • size_i32x16()
    • size_i32x32()
    • size_int16x32()
    • size_int32x16()
    • size_int32x32()
    • cast()
    • i()
    • u()
    • reinterpret()
    • vxtl()
    • vxth()
  • Math
    • exp()
    • log()
    • tanh()
    • sin()
    • cos()
    • rsqrt()
    • sqrt()
    • floor()
    • ceil()
    • ceildiv()
    • pow()
  • Memory
    • ptr()
    • match_buffer()
    • alloc()
    • alloc_buffer()
    • alloc_const()
    • vload()
    • vstore()
    • vload_gather()
    • vstore_scatter()
    • dma_copy()
    • async_dma_copy()
    • dma_transpose2d()
    • dma_upsample()
    • dma_memset()
    • flush_cache()
  • Miscellaneous
    • const_mask()
    • tail_mask()
    • get_local_size()
    • get_local_id()
    • tec_range()
    • perf_tick_begin()
    • perf_tick_end()
    • aiff()
    • async_aiff()
    • printf()
    • asm()
  • Permutation
    • vconcat()
    • vsplit()
    • vzip()
    • vcompt()
    • vcompc()
    • vrevs()
    • vsel()
    • vshfl()
    • vbcast()
    • vsldl()
    • vtbl()
    • vreplic()
  • Synchronization
    • barrier()
    • alloc_events()
    • wait_events()
    • free_events()

Copyright © 2024 Arm Technology (China) Co., Ltd.
