d18b4d2004 Add explicit support for AArch64
Specifies register shape and suffix for floating point and vector
registers. Without the shape, these registers cannot be benchmarked, and
without the suffix, the required suffixes would have to be added manually
in the instruction operands. Specifying both allows using them in the
same manner as on x86.

Additionally there are two small changes affecting all architectures:

Allow the 'w' constraint code, which is used for vector registers on
aarch64.

Always specify a clobber for the flags register as many instructions one
might want to benchmark modify it.
2020-10-20 17:43:14 +02:00

asmbench
========

A benchmark toolkit for assembly instructions using the LLVM JIT.

Usage
=====

To benchmark the latency and throughput of a 64bit integer add, use the following command:

``asmbench 'add {src:i64:r}, {srcdst:i64:r}'``

To benchmark two instructions interleaved, use this:

``asmbench 'add {src:i64:r}, {srcdst:i64:r}' 'sub {src:i64:r}, {srcdst:i64:r}'``

To find out more, add ``-h`` for help and ``-v`` for verbose mode.

Operand Templates
=================
Operands always follow this form: ``{direction:data_type:pass_type}``.

Direction may be ``src``, ``dst`` or ``srcdst``. This allows asmbench to serialize the code (wherever possible). ``src`` operands are read, but not modified, by the instruction. ``dst`` operands are written to, but not read. ``srcdst`` operands are both read and modified by the instruction.
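
The direction semantics can be illustrated with a minimal sketch. This is not asmbench's own parser: ``TEMPLATE_RE``, ``Operand`` and ``parse_operands`` are hypothetical names introduced here, and the regular expression assumes that neither the data type nor the pass type contains a colon or a brace.

```python
import re
from typing import List, NamedTuple

# Hypothetical helper for illustration only, not part of asmbench's API.
# Matches {direction:data_type:pass_type}; assumes the data type and the
# pass type contain no ':' , '{' or '}' characters.
TEMPLATE_RE = re.compile(r"\{(src|dst|srcdst):([^:{}]+):([^:{}]+)\}")


class Operand(NamedTuple):
    direction: str
    data_type: str
    pass_type: str

    @property
    def is_read(self) -> bool:
        # src and srcdst operands are read by the instruction.
        return self.direction in ("src", "srcdst")

    @property
    def is_written(self) -> bool:
        # dst and srcdst operands are written by the instruction.
        return self.direction in ("dst", "srcdst")


def parse_operands(instruction: str) -> List[Operand]:
    """Return every operand template found in an instruction string."""
    return [Operand(*m.groups()) for m in TEMPLATE_RE.finditer(instruction)]


if __name__ == "__main__":
    for op in parse_operands("add {src:i64:r}, {srcdst:i64:r}"):
        print(op.direction, "read:", op.is_read, "written:", op.is_written)
```

An operand that is both read and written, such as the ``srcdst`` operand above, is what lets consecutive copies of the instruction depend on each other.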

Data and Pass Types:

* ``i64:r`` -> 64bit general purpose register (gpr) (e.g., ``%rax``)
* ``i32:r`` -> 32bit gpr (e.g., ``%ecx``)
* ``<2 x double>:x`` -> 128bit SSE register with two double precision floating-point numbers (e.g., ``%xmm1``)
* ``<4 x float>:x`` -> 128bit SSE register with four single precision floating-point numbers (e.g., ``%xmm1``)
* ``<4 x double>:x`` -> 256bit AVX register with four double precision floating-point numbers (e.g., ``%ymm1``)
* ``<8 x float>:x`` -> 256bit AVX register with eight single precision floating-point numbers (e.g., ``%ymm1``)
* ``<8 x double>:x`` -> 512bit AVX512 register with eight double precision floating-point numbers (e.g., ``%zmm1``)
* ``<16 x float>:x`` -> 512bit AVX512 register with sixteen single precision floating-point numbers (e.g., ``%zmm1``)
* ``i8:23`` -> 8bit immediate with value 23 (i.e., ``$23``)
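
The register widths in the list above follow directly from the data type. The sketch below recomputes them and classifies the pass type. The functions ``bit_width`` and ``describe_pass_type`` are hypothetical and only cover the types and pass codes named in this document (``r``, ``x``, numeric immediates, and the ``w`` code for AArch64 vector registers); they are not asmbench's implementation.

```python
import re

# Element sizes in bits for the floating-point types used in the list above.
FLOAT_BITS = {"float": 32, "double": 64}


def bit_width(data_type: str) -> int:
    """Total width in bits of a scalar or vector type such as 'i64' or '<4 x double>'."""
    vec = re.fullmatch(r"<\s*(\d+)\s*x\s*(\w+)\s*>", data_type)
    if vec:
        count, elem = int(vec.group(1)), vec.group(2)
    else:
        count, elem = 1, data_type
    if elem in FLOAT_BITS:
        return count * FLOAT_BITS[elem]
    integer = re.fullmatch(r"i(\d+)", elem)
    if integer is None:
        raise ValueError("unsupported data type: " + data_type)
    return count * int(integer.group(1))


def describe_pass_type(pass_type: str) -> str:
    """Describe a pass type; only the codes mentioned in this document are handled."""
    if pass_type == "r":
        return "general purpose register"
    if pass_type == "x":
        return "x86 vector register"
    if pass_type == "w":
        return "AArch64 vector register"
    if re.fullmatch(r"-?\d+", pass_type):
        return "immediate $" + pass_type
    raise ValueError("unsupported pass type: " + pass_type)


if __name__ == "__main__":
    for data_type, pass_type in [("i64", "r"), ("<8 x float>", "x"), ("i8", "23")]:
        print(data_type, bit_width(data_type), describe_pass_type(pass_type))
```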