Function: __svml_u64div8_z0 | Module: exec | Source: :0-0 | Coverage: 17.62% |
---|
Function: __svml_u64div8_z0 | Module: exec | Source: :0-0 | Coverage: 17.62% |
---|
*** This Panel is Intentionally Left Blank. *** It is due to a lack of debug symbols in the given object |
0x451520 ENDBR64 |
0x451524 VMOVDQU64 0x25652(%RIP),%ZMM4 |
0x45152e VMOVUPD 0x25588(%RIP),%ZMM2 |
0x451538 VMOVUPD 0x256be(%RIP),%ZMM5 |
0x451542 VMOVUPD 0x255b4(%RIP),%ZMM3 |
0x45154c VPCMPEQQ 0x254ea(%RIP),%ZMM1,%K0 |
0x451556 KORTESTB %K0,%K0 |
0x45155a JE 451563 |
0x45155c MOV $0,%EAX |
0x451561 DIV %AL |
0x451563 VPANDQ %ZMM1,%ZMM4,%ZMM6 |
0x451569 VCVTUQQ2PD {rn-sae},%ZMM6,%ZMM6 |
0x45156f VPANDNQ %ZMM1,%ZMM4,%ZMM7 |
0x451575 VCVTUQQ2PD {rn-sae},%ZMM7,%ZMM7 |
0x45157b VCVTUQQ2PD {rn-sae},%ZMM1,%ZMM8 |
0x451581 VANDPD %ZMM8,%ZMM5,%ZMM5 |
0x451587 VSUBPD {rn-sae},%ZMM5,%ZMM7,%ZMM7 |
0x45158d VADDPD {rn-sae},%ZMM7,%ZMM6,%ZMM6 |
0x451593 VCVTUQQ2PD {rn-sae},%ZMM0,%ZMM7 |
0x451599 VRCP14PD %ZMM8,%ZMM9 |
0x45159f VFNMADD231PD {rn-sae},%ZMM8,%ZMM9,%ZMM2 |
0x4515a5 VFMADD132PD {rn-sae},%ZMM9,%ZMM9,%ZMM2 |
0x4515ab VMULPD {rn-sae},%ZMM2,%ZMM7,%ZMM7 |
0x4515b1 VRNDSCALEPD $0x3,{sae},%ZMM7,%ZMM7 |
0x4515b8 VANDPD %ZMM7,%ZMM3,%ZMM7 |
0x4515be VPANDNQ %ZMM0,%ZMM4,%ZMM8 |
0x4515c4 VCVTUQQ2PD {rn-sae},%ZMM8,%ZMM8 |
0x4515ca VFNMADD231PD {rn-sae},%ZMM7,%ZMM5,%ZMM8 |
0x4515d0 VPANDQ %ZMM0,%ZMM4,%ZMM4 |
0x4515d6 VCVTUQQ2PD {rn-sae},%ZMM4,%ZMM4 |
0x4515dc VFNMADD231PD {rn-sae},%ZMM7,%ZMM6,%ZMM4 |
0x4515e2 VADDPD {rn-sae},%ZMM4,%ZMM8,%ZMM4 |
0x4515e8 VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM8 |
0x4515ee VRNDSCALEPD $0x3,{sae},%ZMM8,%ZMM8 |
0x4515f5 VANDPD %ZMM8,%ZMM3,%ZMM8 |
0x4515fb VFNMADD231PD {rn-sae},%ZMM8,%ZMM5,%ZMM4 |
0x451601 VFNMADD231PD {rn-sae},%ZMM8,%ZMM6,%ZMM4 |
0x451607 VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM9 |
0x45160d VRNDSCALEPD $0x3,{sae},%ZMM9,%ZMM9 |
0x451614 VANDPD %ZMM9,%ZMM3,%ZMM3 |
0x45161a VFNMADD231PD {rn-sae},%ZMM5,%ZMM3,%ZMM4 |
0x451620 VFNMADD231PD {rn-sae},%ZMM6,%ZMM3,%ZMM4 |
0x451626 VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM2 |
0x45162c VCVTPD2UQQ {rz-sae},%ZMM3,%ZMM3 |
0x451632 VCVTPD2UQQ {rz-sae},%ZMM8,%ZMM4 |
0x451638 VCVTPD2UQQ {rz-sae},%ZMM7,%ZMM5 |
0x45163e VPADDQ %ZMM4,%ZMM5,%ZMM4 |
0x451644 VCVTPD2UQQ {rz-sae},%ZMM2,%ZMM2 |
0x45164a VPADDD %ZMM3,%ZMM2,%ZMM2 |
0x451650 VPADDQ %ZMM2,%ZMM4,%ZMM2 |
0x451656 VPMULLQ %ZMM1,%ZMM2,%ZMM3 |
0x45165c VPSUBQ %ZMM3,%ZMM0,%ZMM0 |
0x451662 VPCMPNLTUQ %ZMM1,%ZMM0,%K1 |
0x451669 VPADDQ 0x254cd(%RIP),%ZMM2,%ZMM2{%K1} |
0x451673 VMOVDQA64 %ZMM2,%ZMM0 |
0x451679 RET |
0x45167a XCHG %AX,%AX |
0x45167c NOPL (%RAX) |
Path / |
Source file and lines | |
Module | exec |
nb instructions | 55 |
nb uops | 64.50 |
loop length | 342.50 |
used x86 registers | 0.50 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 10 |
nb stack references | 0 |
ADD-SUB / MUL ratio | 0.75 |
micro-operation queue | 10.75 cycles |
front end | 10.75 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 26.00 | 1.50 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.50 | 2.33 |
cycles | 26.00 | 7.50 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.50 | 2.33 |
Cycles executing div or sqrt instructions | 3.00 |
FE+BE cycles | 103.49-103.54 |
Stall cycles | 92.55-92.60 |
RS full (events) | 102.95-103.00 |
Front-end | 10.75 |
Dispatch | 26.00 |
DIV/SQRT | 3.00 |
Overall L1 | 26.00 |
all | 97% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 96% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 99% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 98% |
all | 97% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 96% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 99% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 98% |
Source file and lines | |
Module | exec |
nb instructions | 56 |
nb uops | 67 |
loop length | 346 |
used x86 registers | 1 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 10 |
nb stack references | 0 |
ADD-SUB / MUL ratio | 0.75 |
micro-operation queue | 11.17 cycles |
front end | 11.17 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 26.00 | 3.00 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 1.00 | 2.33 |
cycles | 26.00 | 7.50 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 1.00 | 2.33 |
Cycles executing div or sqrt instructions | 6.00 |
FE+BE cycles | 103.55-103.65 |
Stall cycles | 92.25-92.35 |
RS full (events) | 102.99-103.09 |
Front-end | 11.17 |
Dispatch | 26.00 |
DIV/SQRT | 6.00 |
Overall L1 | 26.00 |
all | 95% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 92% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 98% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 96% |
all | 95% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 93% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 98% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 96% |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ENDBR64 | |||||||||||||||
VMOVDQU64 0x25652(%RIP),%ZMM4 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x25588(%RIP),%ZMM2 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x256be(%RIP),%ZMM5 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x255b4(%RIP),%ZMM3 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VPCMPEQQ 0x254ea(%RIP),%ZMM1,%K0 | 2 | 0 | 0 | 0.33 | 0.33 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 3 | 1 |
KORTESTB %K0,%K0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
JE 451563 <__svml_u64div8_z0+0x43> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 |
MOV $0,%EAX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
DIV %AL | 4 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11-16 | 6 |
VPANDQ %ZMM1,%ZMM4,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM6,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPANDNQ %ZMM1,%ZMM4,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM1,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VANDPD %ZMM8,%ZMM5,%ZMM5 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VSUBPD {rn-sae},%ZMM5,%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VADDPD {rn-sae},%ZMM7,%ZMM6,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM0,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRCP14PD %ZMM8,%ZMM9 | 3 | 2.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 2 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM9,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFMADD132PD {rn-sae},%ZMM9,%ZMM9,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM7,%ZMM7 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM7,%ZMM3,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPANDNQ %ZMM0,%ZMM4,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM8,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM7,%ZMM5,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPANDQ %ZMM0,%ZMM4,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM4,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM7,%ZMM6,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VADDPD {rn-sae},%ZMM4,%ZMM8,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM8,%ZMM8 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM8,%ZMM3,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM5,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM6,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM9 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM9,%ZMM9 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM9,%ZMM3,%ZMM3 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM5,%ZMM3,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM6,%ZMM3,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM3,%ZMM3 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM8,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM7,%ZMM5 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPADDQ %ZMM4,%ZMM5,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM2,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPADDD %ZMM3,%ZMM2,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPADDQ %ZMM2,%ZMM4,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPMULLQ %ZMM1,%ZMM2,%ZMM3 | 5 | 1.50 | 0 | 0 | 0 | 0 | 1.50 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 1.50 |
VPSUBQ %ZMM3,%ZMM0,%ZMM0 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0-1 | 0.50 |
VPCMPNLTUQ %ZMM1,%ZMM0,%K1 | |||||||||||||||
VPADDQ 0x254cd(%RIP),%ZMM2,%ZMM2{%K1} | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1 | 0.67 |
VMOVDQA64 %ZMM2,%ZMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0-1 | 0.17 |
RET | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0.33 | 0 | 2.13 |
Source file and lines | |
Module | exec |
nb instructions | 54 |
nb uops | 62 |
loop length | 339 |
used x86 registers | 0 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 10 |
nb stack references | 0 |
ADD-SUB / MUL ratio | 0.75 |
micro-operation queue | 10.33 cycles |
front end | 10.33 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 26.00 | 0.00 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 2.33 |
cycles | 26.00 | 7.50 | 2.33 | 2.33 | 0.00 | 26.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 2.33 |
Cycles executing div or sqrt instructions | NA |
FE+BE cycles | 103.43 |
Stall cycles | 92.84 |
RS full (events) | 102.92 |
Front-end | 10.33 |
Dispatch | 26.00 |
Overall L1 | 26.00 |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 100% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | NA (no fma vectorizable/vectorized instructions) |
other | 100% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
all | 100% |
load | 100% |
store | NA (no store vectorizable/vectorized instructions) |
mul | 100% |
add-sub | 100% |
fma | 100% |
div/sqrt | 100% |
other | 100% |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ENDBR64 | |||||||||||||||
VMOVDQU64 0x25652(%RIP),%ZMM4 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x25588(%RIP),%ZMM2 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x256be(%RIP),%ZMM5 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VMOVUPD 0x255b4(%RIP),%ZMM3 | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0-1 | 0.50 |
VPCMPEQQ 0x254ea(%RIP),%ZMM1,%K0 | 2 | 0 | 0 | 0.33 | 0.33 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 3 | 1 |
KORTESTB %K0,%K0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
JE 451563 <__svml_u64div8_z0+0x43> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 |
VPANDQ %ZMM1,%ZMM4,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM6,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPANDNQ %ZMM1,%ZMM4,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM1,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VANDPD %ZMM8,%ZMM5,%ZMM5 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VSUBPD {rn-sae},%ZMM5,%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VADDPD {rn-sae},%ZMM7,%ZMM6,%ZMM6 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM0,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRCP14PD %ZMM8,%ZMM9 | 3 | 2.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 2 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM9,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFMADD132PD {rn-sae},%ZMM9,%ZMM9,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM7,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM7,%ZMM7 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM7,%ZMM3,%ZMM7 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPANDNQ %ZMM0,%ZMM4,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM8,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM7,%ZMM5,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPANDQ %ZMM0,%ZMM4,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTUQQ2PD {rn-sae},%ZMM4,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM7,%ZMM6,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VADDPD {rn-sae},%ZMM4,%ZMM8,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM8,%ZMM8 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM8,%ZMM3,%ZMM8 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM5,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM8,%ZMM6,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM9 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VRNDSCALEPD $0x3,{sae},%ZMM9,%ZMM9 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 |
VANDPD %ZMM9,%ZMM3,%ZMM3 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM5,%ZMM3,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VFNMADD231PD {rn-sae},%ZMM6,%ZMM3,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VMULPD {rn-sae},%ZMM2,%ZMM4,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM3,%ZMM3 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM8,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM7,%ZMM5 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPADDQ %ZMM4,%ZMM5,%ZMM4 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VCVTPD2UQQ {rz-sae},%ZMM2,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
VPADDD %ZMM3,%ZMM2,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPADDQ %ZMM2,%ZMM4,%ZMM2 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
VPMULLQ %ZMM1,%ZMM2,%ZMM3 | 5 | 1.50 | 0 | 0 | 0 | 0 | 1.50 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 1.50 |
VPSUBQ %ZMM3,%ZMM0,%ZMM0 | 1 | 0.50 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0-1 | 0.50 |
VPCMPNLTUQ %ZMM1,%ZMM0,%K1 | |||||||||||||||
VPADDQ 0x254cd(%RIP),%ZMM2,%ZMM2{%K1} | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1 | 0.67 |
VMOVDQA64 %ZMM2,%ZMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0-1 | 0.17 |
RET | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0.33 | 0 | 2.13 |
Name | Coverage (%) | Time (s) |
---|---|---|
○__svml_u64div8_z0 | 17.62 | 15.67 |