| Function: __svml_sexp_cout_rare_internal | Module: attention-icx-skl256 | Source: :0-0 | Coverage (incl. loops): 3.43% | (excl. loops): 3.43% |
|---|
| Function: __svml_sexp_cout_rare_internal | Module: attention-icx-skl256 | Source: :0-0 | Coverage (incl. loops): 3.43% | (excl. loops): 3.43% |
|---|
*** This Panel is Intentionally Left Blank. *** It is due to a lack of debug symbols in the given object |
0x406ee0 ENDBR64 |
0x406ee4 MOV (%RDI),%EAX |
0x406ee6 MOV %EAX,%ECX |
0x406ee8 NOT %ECX |
0x406eea MOVD %EAX,%XMM0 |
0x406eee TEST $0x7f800000,%ECX |
0x406ef4 JNE 406f13 |
0x406ef6 AND $-0x7f800001,%EAX |
0x406efb XORPS %XMM1,%XMM1 |
0x406efe CMP $-0x80000000,%EAX |
0x406f03 JE 406f0c |
0x406f05 MULSS %XMM0,%XMM0 |
0x406f09 MOVAPS %XMM0,%XMM1 |
0x406f0c XOR %EAX,%EAX |
0x406f0e MOVSS %XMM1,(%RSI) |
0x406f12 RET |
0x406f13 MOVSS 0x7109(%RIP),%XMM1 |
0x406f1b UCOMISS %XMM0,%XMM1 |
0x406f1e JAE 406f32 |
0x406f20 MOVL $0x7f800000,-0xc(%RSP) |
0x406f28 MOV $0x3,%EAX |
0x406f2d JMP 407085 |
0x406f32 UCOMISS 0x70ef(%RIP),%XMM0 |
0x406f39 JAE 406f48 |
0x406f3b MOVL $0,-0xc(%RSP) |
0x406f43 JMP 407032 |
0x406f48 MOVSS 0x70dc(%RIP),%XMM1 |
0x406f50 MULSS %XMM0,%XMM1 |
0x406f54 MOVSS %XMM1,-0x8(%RSP) |
0x406f5a MOVSS -0x8(%RSP),%XMM1 |
0x406f60 ADDSS 0x70c8(%RIP),%XMM1 |
0x406f68 MOVSS %XMM1,-0x4(%RSP) |
0x406f6e MOVSS -0x4(%RSP),%XMM1 |
0x406f74 MOVSS 0x70b8(%RIP),%XMM2 |
0x406f7c ADDSS %XMM1,%XMM2 |
0x406f80 MOVSS %XMM2,-0x8(%RSP) |
0x406f86 MOVSS -0x8(%RSP),%XMM2 |
0x406f8c MULSS 0x70a4(%RIP),%XMM2 |
0x406f94 MOVSS -0x8(%RSP),%XMM3 |
0x406f9a MULSS 0x709a(%RIP),%XMM3 |
0x406fa2 ADDSS %XMM0,%XMM2 |
0x406fa6 ADDSS %XMM3,%XMM2 |
0x406faa MOVSS 0x708e(%RIP),%XMM3 |
0x406fb2 MULSS %XMM2,%XMM3 |
0x406fb6 ADDSS 0x7086(%RIP),%XMM3 |
0x406fbe MOVD %XMM1,%EAX |
0x406fc2 MULSS %XMM2,%XMM3 |
0x406fc6 ADDSS 0x707a(%RIP),%XMM3 |
0x406fce MULSS %XMM2,%XMM3 |
0x406fd2 ADDSS 0x7072(%RIP),%XMM3 |
0x406fda MULSS %XMM2,%XMM3 |
0x406fde MOVSS 0x706a(%RIP),%XMM1 |
0x406fe6 ADDSS %XMM1,%XMM3 |
0x406fea MULSS %XMM2,%XMM3 |
0x406fee ADDSS %XMM1,%XMM3 |
0x406ff2 MOVSS %XMM3,-0xc(%RSP) |
0x406ff8 UCOMISS 0x7055(%RIP),%XMM0 |
0x406fff JAE 407039 |
0x407001 SAL $0x17,%EAX |
0x407004 ADD $0x5d800000,%EAX |
0x407009 AND $0x7f800000,%EAX |
0x40700e MOVD %EAX,%XMM0 |
0x407012 MULSS -0xc(%RSP),%XMM0 |
0x407018 MOVSS %XMM0,-0xc(%RSP) |
0x40701e MOVSS -0xc(%RSP),%XMM0 |
0x407024 MULSS 0x702c(%RIP),%XMM0 |
0x40702c MOVSS %XMM0,-0xc(%RSP) |
0x407032 MOV $0x4,%EAX |
0x407037 JMP 407085 |
0x407039 MOVSX %AX,%ECX |
0x40703c ADD $0x7f,%ECX |
0x40703f CMP $0xfe,%ECX |
0x407045 JA 407056 |
0x407047 SAL $0x17,%ECX |
0x40704a MOVD %ECX,%XMM0 |
0x40704e MULSS -0xc(%RSP),%XMM0 |
0x407054 JMP 40707d |
0x407056 SAL $0x17,%EAX |
0x407059 ADD $0x3f000000,%EAX |
0x40705e AND $0x7f800000,%EAX |
0x407063 MOVD %EAX,%XMM0 |
0x407067 MULSS -0xc(%RSP),%XMM0 |
0x40706d MOVSS %XMM0,-0xc(%RSP) |
0x407073 MOVSS -0xc(%RSP),%XMM0 |
0x407079 ADDSS %XMM0,%XMM0 |
0x40707d MOVSS %XMM0,-0xc(%RSP) |
0x407083 XOR %EAX,%EAX |
0x407085 MOVSS -0xc(%RSP),%XMM1 |
0x40708b MOVSS %XMM1,(%RSI) |
0x40708f RET |
| Coverage (%) | Name | Source Location | Module |
|---|---|---|---|
| ►51.77+ | main | attention_v2.cpp:56 | attention-icx-skl256 |
| ○ | __libc_init_first | libc.so.6 | |
| ○ | __libc_start_main | libc.so.6 | |
| ○ | _start | attention-icx-skl256 | |
| ►40.43+ | main | attention_v2.cpp:53 | attention-icx-skl256 |
| ○ | __libc_init_first | libc.so.6 | |
| ○ | __libc_start_main | libc.so.6 | |
| ○ | _start | attention-icx-skl256 | |
| ►7.80+ | __svml_expf8 | attention-icx-skl256 | |
| ○ | __libc_init_first | libc.so.6 | |
| ○ | __libc_start_main | libc.so.6 | |
| ○ | _start | attention-icx-skl256 |
| min | med | avg | max |
|---|---|---|---|
| Percentile Index | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Value |
| min | med | avg | max |
|---|---|---|---|
| Percentile Index | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Value |
| Path / |
The code analyzed by CQA in that panel excludes loops and represents 3.43% of application time for run run_0
| Source file and lines | |
| Module | attention-icx-skl256 |
| nb instructions | 90 |
| nb uops | 91 |
| loop length | 432 |
| used x86 registers | 5 |
| used mmx registers | 0 |
| used xmm registers | 4 |
| used ymm registers | 0 |
| used zmm registers | 0 |
| nb stack references | 3 |
| ADD-SUB / MUL ratio | 0.77 |
| micro-operation queue | 22.75 cycles |
| front end | 22.75 cycles |
| P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | |
|---|---|---|---|---|---|---|---|---|
| uops | 15.00 | 15.00 | 13.17 | 12.83 | 12.00 | 14.50 | 14.50 | 13.00 |
| cycles | 15.00 | 15.00 | 13.17 | 12.83 | 12.00 | 14.50 | 14.50 | 13.00 |
| Cycles executing div or sqrt instructions | NA |
| Front-end | 22.75 |
| Dispatch | 15.00 |
| Overall L1 | 22.75 |
| all | 0% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | 0% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 0% |
| all | 4% |
| load | 0% |
| store | 0% |
| mul | 0% |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 40% |
| all | 3% |
| load | 0% |
| store | 0% |
| mul | 0% |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 12% |
| all | 6% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | 6% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 6% |
| all | 7% |
| load | 6% |
| store | 6% |
| mul | 6% |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 13% |
| all | 6% |
| load | 6% |
| store | 6% |
| mul | 6% |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 8% |
| Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | Latency | Recip. throughput | Vectorization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENDBR64 | N/A | |||||||||||
| MOV (%RDI),%EAX | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | N/A |
| MOV %EAX,%ECX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | scal (6.3%) |
| NOT %ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| TEST $0x7f800000,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JNE 406f13 <__svml_sexp_cout_rare_internal+0x33> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| AND $-0x7f800001,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| XORPS %XMM1,%XMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | vect (25.0%) |
| CMP $-0x80000000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JE 406f0c <__svml_sexp_cout_rare_internal+0x2c> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MULSS %XMM0,%XMM0 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVAPS %XMM0,%XMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | vect (25.0%) |
| XOR %EAX,%EAX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | N/A |
| MOVSS %XMM1,(%RSI) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| RET | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 1 | 0.33 | 0 | 1 | N/A |
| MOVSS 0x7109(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| UCOMISS %XMM0,%XMM1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 406f32 <__svml_sexp_cout_rare_internal+0x52> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MOVL $0x7f800000,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 2 | 1 | scal (6.3%) |
| MOV $0x3,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| JMP 407085 <__svml_sexp_cout_rare_internal+0x1a5> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| UCOMISS 0x70ef(%RIP),%XMM0 | 2 | 1 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 406f48 <__svml_sexp_cout_rare_internal+0x68> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MOVL $0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 2 | 1 | scal (6.3%) |
| JMP 407032 <__svml_sexp_cout_rare_internal+0x152> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| MOVSS 0x70dc(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS %XMM0,%XMM1 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,-0x8(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS 0x70c8(%RIP),%XMM1 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,-0x4(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x4(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MOVSS 0x70b8(%RIP),%XMM2 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM2,-0x8(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM2 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x70a4(%RIP),%XMM2 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM3 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x709a(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM0,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM3,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS 0x708e(%RIP),%XMM3 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x7086(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVD %XMM1,%EAX | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x707a(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x7072(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS 0x706a(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM3,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| UCOMISS 0x7055(%RIP),%XMM0 | 2 | 1 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 407039 <__svml_sexp_cout_rare_internal+0x159> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| SAL $0x17,%EAX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | N/A |
| ADD $0x5d800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| AND $0x7f800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0xc(%RSP),%XMM0 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x702c(%RIP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOV $0x4,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| JMP 407085 <__svml_sexp_cout_rare_internal+0x1a5> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| MOVSX %AX,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| ADD $0x7f,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| CMP $0xfe,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JA 407056 <__svml_sexp_cout_rare_internal+0x176> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| SAL $0x17,%ECX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | scal (6.3%) |
| MOVD %ECX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| JMP 40707d <__svml_sexp_cout_rare_internal+0x19d> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| SAL $0x17,%EAX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | N/A |
| ADD $0x3f000000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| AND $0x7f800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0xc(%RSP),%XMM0 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM0,%XMM0 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| XOR %EAX,%EAX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | N/A |
| MOVSS -0xc(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,(%RSI) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| RET | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 1 | 0.33 | 0 | 1 | N/A |
The code analyzed by CQA in that panel excludes loops and represents 3.43% of application time for run run_0
| Source file and lines | |
| Module | attention-icx-skl256 |
| nb instructions | 90 |
| nb uops | 91 |
| loop length | 432 |
| used x86 registers | 5 |
| used mmx registers | 0 |
| used xmm registers | 4 |
| used ymm registers | 0 |
| used zmm registers | 0 |
| nb stack references | 3 |
| ADD-SUB / MUL ratio | 0.77 |
| micro-operation queue | 22.75 cycles |
| front end | 22.75 cycles |
| P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | |
|---|---|---|---|---|---|---|---|---|
| uops | 15.00 | 15.00 | 13.17 | 12.83 | 12.00 | 14.50 | 14.50 | 13.00 |
| cycles | 15.00 | 15.00 | 13.17 | 12.83 | 12.00 | 14.50 | 14.50 | 13.00 |
| Cycles executing div or sqrt instructions | NA |
| Front-end | 22.75 |
| Dispatch | 15.00 |
| Overall L1 | 22.75 |
| all | 0% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | 0% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 0% |
| all | 4% |
| load | 0% |
| store | 0% |
| mul | 0% |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 40% |
| all | 3% |
| load | 0% |
| store | 0% |
| mul | 0% |
| add-sub | 0% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 12% |
| all | 6% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | 6% |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 6% |
| all | 7% |
| load | 6% |
| store | 6% |
| mul | 6% |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 13% |
| all | 6% |
| load | 6% |
| store | 6% |
| mul | 6% |
| add-sub | 6% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
| other | 8% |
| Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | Latency | Recip. throughput | Vectorization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENDBR64 | N/A | |||||||||||
| MOV (%RDI),%EAX | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | N/A |
| MOV %EAX,%ECX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | scal (6.3%) |
| NOT %ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| TEST $0x7f800000,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JNE 406f13 <__svml_sexp_cout_rare_internal+0x33> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| AND $-0x7f800001,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| XORPS %XMM1,%XMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | vect (25.0%) |
| CMP $-0x80000000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JE 406f0c <__svml_sexp_cout_rare_internal+0x2c> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MULSS %XMM0,%XMM0 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVAPS %XMM0,%XMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | vect (25.0%) |
| XOR %EAX,%EAX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | N/A |
| MOVSS %XMM1,(%RSI) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| RET | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 1 | 0.33 | 0 | 1 | N/A |
| MOVSS 0x7109(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| UCOMISS %XMM0,%XMM1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 406f32 <__svml_sexp_cout_rare_internal+0x52> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MOVL $0x7f800000,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 2 | 1 | scal (6.3%) |
| MOV $0x3,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| JMP 407085 <__svml_sexp_cout_rare_internal+0x1a5> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| UCOMISS 0x70ef(%RIP),%XMM0 | 2 | 1 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 406f48 <__svml_sexp_cout_rare_internal+0x68> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| MOVL $0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 2 | 1 | scal (6.3%) |
| JMP 407032 <__svml_sexp_cout_rare_internal+0x152> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| MOVSS 0x70dc(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS %XMM0,%XMM1 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,-0x8(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS 0x70c8(%RIP),%XMM1 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,-0x4(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x4(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MOVSS 0x70b8(%RIP),%XMM2 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM2,-0x8(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM2 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x70a4(%RIP),%XMM2 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS -0x8(%RSP),%XMM3 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x709a(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM0,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM3,%XMM2 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS 0x708e(%RIP),%XMM3 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x7086(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVD %XMM1,%EAX | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x707a(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS 0x7072(%RIP),%XMM3 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS 0x706a(%RIP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MULSS %XMM2,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| ADDSS %XMM1,%XMM3 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM3,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| UCOMISS 0x7055(%RIP),%XMM0 | 2 | 1 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 1 | scal (6.3%) |
| JAE 407039 <__svml_sexp_cout_rare_internal+0x159> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| SAL $0x17,%EAX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | N/A |
| ADD $0x5d800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| AND $0x7f800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0xc(%RSP),%XMM0 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MULSS 0x702c(%RIP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOV $0x4,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| JMP 407085 <__svml_sexp_cout_rare_internal+0x1a5> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| MOVSX %AX,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| ADD $0x7f,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| CMP $0xfe,%ECX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | scal (6.3%) |
| JA 407056 <__svml_sexp_cout_rare_internal+0x176> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50-1 | N/A |
| SAL $0x17,%ECX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | scal (6.3%) |
| MOVD %ECX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| JMP 40707d <__svml_sexp_cout_rare_internal+0x19d> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1-2 | N/A |
| SAL $0x17,%EAX | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 1 | 0.50 | N/A |
| ADD $0x3f000000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| AND $0x7f800000,%EAX | 1 | 0.25 | 0.25 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 1 | 0.25 | N/A |
| MOVD %EAX,%XMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | scal (6.3%) |
| MULSS -0xc(%RSP),%XMM0 | 1 | 0.50 | 0.50 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| MOVSS -0xc(%RSP),%XMM0 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| ADDSS %XMM0,%XMM0 | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 | scal (6.3%) |
| MOVSS %XMM0,-0xc(%RSP) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| XOR %EAX,%EAX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | N/A |
| MOVSS -0xc(%RSP),%XMM1 | 1 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4-5 | 0.50 | scal (6.3%) |
| MOVSS %XMM1,(%RSI) | 1 | 0 | 0 | 0.33 | 0.33 | 1 | 0 | 0 | 0.33 | 3 | 1 | scal (6.3%) |
| RET | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 1 | 0.33 | 0 | 1 | N/A |
| Name | Coverage (%) | Time (s) |
|---|---|---|
| ○__svml_sexp_cout_rare_internal | 3.43 | 0.70 |
