| ID | Module | Source Location | Source Function | Level | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Coverage (% app. time) | Speedup if no scalar integer | Speedup if FP arith vectorized | Speedup if fully vectorized | Speedup if FP only | Number of paths | Vectorization Ratio (%) | Vector Length Use (%) | Flops (GFLOP/s) | CQA cycles | CQA cycles if no scalar integer | CQA cycles if FP arith vectorized | CQA cycles if fully vectorized | CQA cycles if FP only |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Loop 18 | attention-gcc-skl512 | attention_v2.cpp:30-31 | main | Innermost | 9.41 | 9.41 | 34.01 | 1.00 | 2.58 | 14.55 | 1.00 | 1 | 20.00 | 11.25 | 1.10 | 4.00 | 4.00 | 1.55 | 0.28 | 4.00 |
| Loop 15 | attention-gcc-skl512 | attention_v2.cpp:30-31 | main | Innermost | 7.44 | 7.44 | 26.89 | 1.00 | 2.58 | 14.55 | 1.00 | 1 | 20.00 | 11.25 | 1.32 | 4.00 | 4.00 | 1.55 | 0.28 | 4.00 |
| Loop 10 | attention-gcc-skl512 | attention_v2.cpp:30-31 | main | Innermost | 2.13 | 2.13 | 7.68 | 1.00 | 2.58 | 14.55 | 1.00 | 1 | 20.00 | 11.25 | 1.18 | 4.00 | 4.00 | 1.55 | 0.28 | 4.00 |
| Loop 4 | attention-gcc-skl512 | attention_v2.cpp:30-31 | main | Innermost | 1.92 | 1.92 | 6.95 | 1.00 | 2.58 | 14.55 | 1.00 | 1 | 20.00 | 11.25 | 1.29 | 4.00 | 4.00 | 1.55 | 0.28 | 4.00 |
| Loop 7 | attention-gcc-skl512 | attention_v2.cpp:30-31 | main | Innermost | 1.87 | 1.86 | 6.74 | 1.00 | 2.58 | 14.55 | 1.00 | 1 | 20.00 | 11.25 | 1.27 | 4.00 | 4.00 | 1.55 | 0.28 | 4.00 |
| Loop 32 | attention-gcc-skl512 | attention_v2.cpp:55-56 | softmax(float const*, float*, float*, int) | Innermost | 1.23 | 1.23 | 4.42 | 1.00 | 2.00 | 4.00 | 1.00 | 1 | 0.00 | 6.25 | 0.30 | 3.00 | 3.00 | 1.50 | 0.75 | 3.00 |
| Loop 16 | attention-gcc-skl512 | attention_v2.cpp:27-30,attention_v2.cpp:33-33,attention_v2.cpp:236-236 | main | InBetween | 0.66 | 0.66 | 2.40 | 1.57 | 2.75 | 14.67 | 2.75 | 1 | 20.00 | 11.25 | 1.38 | 2.75 | 1.75 | 1.00 | 0.19 | 1.00 |
| Loop 2 | attention-gcc-skl512 | attention_v2.cpp:163-163,random.tcc:458-466,random.tcc:3557-3558 | main | Innermost | 0.36 | 0.36 | 1.30 | 3.08 | 1.95 | 12.03 | 4.63 | 2 | 4.00 | 10.78 | 0.08 | 9.25 | 3.00 | 4.74 | 0.77 | 2.00 |
| Loop 31 | attention-gcc-skl512 | attention_v2.cpp:52-53 | softmax(float const*, float*, float*, int) | Innermost | 0.17 | 0.17 | 0.61 | 1.33 | 1.33 | 16.00 | 2.00 | 1 | 0.00 | 6.25 | 1.12 | 2.00 | 1.50 | 1.50 | 0.13 | 1.00 |
| Loop 5 | attention-gcc-skl512 | attention_v2.cpp:27-30,attention_v2.cpp:33-33,random.tcc:422-422 | main | InBetween | 0.14 | 0.14 | 0.52 | 1.67 | 1.00 | 14.55 | 2.50 | 1 | 25.00 | 12.50 | 1.17 | 2.50 | 1.50 | 2.50 | 0.17 | 1.00 |
| Loop 11 | attention-gcc-skl512 | attention_v2.cpp:26-30,attention_v2.cpp:33-33 | main | InBetween | 0.14 | 0.14 | 0.51 | 1.67 | 1.00 | 14.55 | 2.50 | 1 | 25.00 | 12.50 | 0.79 | 2.50 | 1.50 | 2.50 | 0.17 | 1.00 |
| Loop 8 | attention-gcc-skl512 | attention_v2.cpp:26-30,attention_v2.cpp:33-33 | main | InBetween | 0.13 | 0.14 | 0.49 | 1.67 | 1.00 | 14.55 | 2.50 | 1 | 25.00 | 12.50 | 0.96 | 2.50 | 1.50 | 2.50 | 0.17 | 1.00 |
| Loop 19 | attention-gcc-skl512 | attention_v2.cpp:26-30,attention_v2.cpp:33-33 | main | InBetween | 0.11 | 0.11 | 0.40 | 1.67 | 1.00 | 14.55 | 2.50 | 1 | 25.00 | 12.50 | 1.45 | 2.50 | 1.50 | 2.50 | 0.17 | 1.00 |
| Loop 3 | attention-gcc-skl512 | attention_v2.cpp:164-167,random.tcc:406-409,random.tcc:458-459,random.tcc:462-466,random.tcc:3519-3519,random.tcc:3557-3558 | main | InBetween | 0.08 | 0.08 | 0.29 | 2.57 | 1.28 | 1.85 | 8.56 | 8 | 17.02 | 19.68 | 0.13 | 19.25 | 7.50 | 15.06 | 10.42 | 2.25 |
| Loop 26 | attention-gcc-skl512 | random.tcc:458-466,random.tcc:3557-3558 | main | Innermost | 0.04 | 0.04 | 0.14 | 2.67 | 2.33 | 2.39 | 5.33 | 2 | 10.00 | 13.28 | 0.75 | 8.00 | 3.00 | 3.43 | 3.35 | 1.50 |
| Loop 13 | attention-gcc-skl512 | attention_v2.cpp:237-238 | main | Innermost | 0.04 | 0.04 | 0.14 | 1.25 | 1.00 | 16.00 | 1.25 | 1 | 0.00 | 6.25 | 0.00 | 1.25 | 1.00 | 1.25 | 0.08 | 1.00 |
| Loop 42 | attention-gcc-skl512 | random.tcc:412-417 | std::mersenne_twister_engine | Single | 0.04 | 0.04 | 0.14 | 1.00 | 1.00 | 1.00 | 4.00 | 1 | 100.00 | 100.00 | 0.00 | 4.00 | 4.00 | 4.00 | 4.00 | 1.00 |
| Loop 35 | attention-gcc-skl512 | attention_v2.cpp:47-48 | softmax(float const*, float*, float*, int) | InBetween | 0.03 | 0.03 | 0.11 | 1.50 | 1.00 | 9.81 | 3.38 | 1 | 32.00 | 13.75 | 0.00 | 13.50 | 9.00 | 13.50 | 1.38 | 4.00 |
| Loop 34 | attention-gcc-skl512 | attention_v2.cpp:43-43,attention_v2.cpp:46-47,attention_v2.cpp:50-52,attention_v2.cpp:58-61 | softmax(float const*, float*, float*, int) | InBetween | 0.03 | 0.03 | 0.11 | 3.10 | 1.00 | 9.25 | 10.83 | 1 | 26.32 | 15.13 | 0.00 | 16.25 | 5.25 | 16.25 | 1.76 | 1.50 |
| Loop 23 | attention-gcc-skl512 | random.tcc:404-409,random.tcc:414-417,random.tcc:420-423,random.tcc:458-458,random.tcc:462-466,random.tcc:3557-3558 | main | InBetween | 0.03 | 0.02 | 0.09 | 2.02 | 2.07 | 8.16 | 5.00 | 2 | 17.02 | 15.30 | 0.00 | 13.13 | 6.50 | 6.33 | 1.61 | 2.63 |
| Loop 25 | attention-gcc-skl512 | random.tcc:412-417 | main | Innermost | 0.02 | 0.02 | 0.07 | 1.00 | 1.00 | 1.00 | 4.00 | 1 | 100.00 | 100.00 | 0.00 | 4.00 | 4.00 | 4.00 | 4.00 | 1.00 |
| Loop 41 | attention-gcc-skl512 | random.tcc:404-409 | std::mersenne_twister_engine | Single | 0.01 | 0.01 | 0.04 | 1.00 | 1.00 | 1.00 | 4.00 | 1 | 100.00 | 100.00 | 0.00 | 4.00 | 4.00 | 4.00 | 4.00 | 1.00 |
| Loop 24 | attention-gcc-skl512 | random.tcc:404-409 | main | Innermost | 0.01 | 0.01 | 0.02 | 1.00 | 1.00 | 1.00 | 4.00 | 1 | 100.00 | 100.00 | 0.00 | 4.00 | 4.00 | 4.00 | 4.00 | 1.00 |
| Loop 20 | attention-gcc-skl512 | attention_v2.cpp:26-27 | main | InBetween | 0.01 | 0.01 | 0.02 | 1.00 | 1.00 | 16.00 | 1.50 | 1 | 0.00 | 6.25 | 0.00 | 1.50 | 1.50 | 1.50 | 0.09 | 1.00 |
| Loop 36 | attention-gcc-skl512 | attention_v2.cpp:43-43,attention_v2.cpp:46-47 | softmax(float const*, float*, float*, int) | Outermost | 0.01 | 0.01 | 0.02 | 2.50 | 1.00 | 16.00 | NA | 1 | 25.00 | 17.19 | 0.00 | 1.25 | 0.50 | 1.25 | 0.08 | NA |
| Loop 33 | attention-gcc-skl512 | attention_v2.cpp:47-48 | softmax(float const*, float*, float*, int) | Innermost | 0.01 | 0.01 | 0.02 | 1.00 | 1.00 | 1.00 | 1.00 | 1 | 100.00 | 100.00 | 0.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 |
| Loop 1 | attention-gcc-skl512 | random.h:585-585,random.tcc:333-339 | main | Innermost | 0.01 | 0.01 | 0.02 | 1.00 | 1.00 | 8.00 | 1.00 | 1 | 0.00 | 12.50 | 0.00 | 6.00 | 6.00 | 6.00 | 0.75 | 6.00 |
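
As a reading aid for the speedup columns: the numbers in the table look consistent with each "Speedup if ..." value being the ratio of "CQA cycles" to the matching "CQA cycles if ..." value. This relationship is inferred from the rows, not stated by the report, and the helper name `estimated_speedup` is hypothetical. The sketch below recomputes "Speedup if fully vectorized" for three rows. Small differences from the reported values are expected, because the table shows cycle counts rounded to two decimals.

```python
# Values copied from the table above:
# (CQA cycles, CQA cycles if fully vectorized, reported "Speedup if fully vectorized")
rows = {
    "Loop 18": (4.00, 0.28, 14.55),
    "Loop 32": (3.00, 0.75, 4.00),
    "Loop 16": (2.75, 0.19, 14.67),
}


def estimated_speedup(cycles: float, cycles_if: float) -> float:
    """Assumed relation: speedup = baseline cycles / what-if cycles."""
    return cycles / cycles_if


for name, (cycles, cycles_vec, reported) in rows.items():
    estimate = estimated_speedup(cycles, cycles_vec)
    print(f"{name}: estimated {estimate:.2f}, reported {reported:.2f}")
```

The same ratio can be checked against the other what-if columns, for example Loop 16 with 2.75 cycles and 1.75 cycles "if no scalar integer", against a reported speedup of 1.57.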