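| gcc ASM Fct ID | gcc Coverage (%) | gcc Inc. Time w.r.t. Wall Time (s) | gcc Max Inc. Time over Threads (s) | gcc Nb Threads | gcc Deviation (coverage) | gcc Deviation (time) | gcc GFLOP/s | armclang ASM Fct ID | armclang Coverage (%) | armclang Inc. Time w.r.t. Wall Time (s) | armclang Max Inc. Time over Threads (s) | armclang Nb Threads | armclang Deviation (coverage) | armclang Deviation (time) | armclang GFLOP/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |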
| -1 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.50 | -1 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.50 |
| 443 | 0.58 | 0.05 | 0.05 | 1 | 0.00 | 0.00 | 4.98 | 338 | 0.43 | 0.04 | 0.04 | 1 | 0.00 | 0.00 | 11.79 |
| 338 | 6.64 | 0.57 | 0.57 | 1 | 0.00 | 0.00 | 5.73 | -1 | 0.12 | 0.01 | 0.01 | 1 | 0.00 | 0.00 | 23.63 |
| 1126 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 1299 | 0.37 | 0.03 | 0.03 | 1 | 0.00 | 0.00 | 2.29 |
| 1299 | 0.29 | 0.02 | 0.03 | 1 | 0.00 | 0.00 | 1.40 | 1128 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| -1 | 0.29 | 0.02 | 0.03 | 1 | 0.00 | 0.00 | 6.10 | 1126 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| -1 | 0.06 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 248 | 3.06 | 0.25 | 0.25 | 1 | 0.00 | 0.00 | 15.04 |

- /home/eoseret/llm-attention/attention_v2.cpp: 42-63

| gcc ASM Fct ID | gcc Coverage (%) | gcc Inc. Time w.r.t. Wall Time (s) | gcc Max Inc. Time over Threads (s) | gcc Nb Threads | gcc Deviation (coverage) | gcc Deviation (time) | gcc GFLOP/s | armclang ASM Fct ID | armclang Coverage (%) | armclang Inc. Time w.r.t. Wall Time (s) | armclang Max Inc. Time over Threads (s) | armclang Nb Threads | armclang Deviation (coverage) | armclang Deviation (time) | armclang GFLOP/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 0.87 | 0.08 | 0.08 | 1 | 0.00 | 0.00 | 4.80 | | | | | | | | |

- /usr/include/c++/14/bits/random.tcc: 404-425

| gcc ASM Fct ID | gcc Coverage (%) | gcc Inc. Time w.r.t. Wall Time (s) | gcc Max Inc. Time over Threads (s) | gcc Nb Threads | gcc Deviation (coverage) | gcc Deviation (time) | gcc GFLOP/s | armclang ASM Fct ID | armclang Coverage (%) | armclang Inc. Time w.r.t. Wall Time (s) | armclang Max Inc. Time over Threads (s) | armclang Nb Threads | armclang Deviation (coverage) | armclang Deviation (time) | armclang GFLOP/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12 | 0.29 | 0.02 | 0.03 | 1 | 0.00 | 0.00 | 0.15 | | | | | | | | |
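
| Name | Module | gcc Coverage (%) | armclang Coverage (%) | gcc Inclusive Time w.r.t. Wall Time (s) | armclang Inclusive Time w.r.t. Wall Time (s) | gcc Max Inc. Time over Threads (s) | armclang Max Inc. Time over Threads (s) | gcc Nb Threads | armclang Nb Threads | gcc GFLOP/s | armclang GFLOP/s | gcc Deviation (coverage) | armclang Deviation (coverage) | gcc Deviation (time) | armclang Deviation (time) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |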
| main | binary | 90.82 | 95.84 | 7.86 | 7.84 | 7.86 | 7.84 | 1 | 1 | 7.46 | 7.59 | 0.00 | 0.00 | 0.00 | 0.00 |
| __expf_finite | libm.so.6 | 6.64 | 0.43 | 0.57 | 0.04 | 0.57 | 0.04 | 1 | 1 | 5.73 | 11.79 | 0.00 | 0.00 | 0.00 | 0.00 |
| _ZGVnN4v_expf | libamath.so | NA | 3.06 | NA | 0.25 | NA | 0.25 | NA | 1 | NA | 15.04 | NA | 0.00 | NA | 0.00 |
| softmax(float const*, float*, float*, int) | binary | 0.87 | NA | 0.08 | NA | 0.08 | NA | 1 | NA | 4.80 | NA | 0.00 | NA | 0.00 | NA |
| __GI___memset_generic | libc.so.6 | 0.29 | 0.37 | 0.02 | 0.03 | 0.03 | 0.03 | 1 | 1 | 1.40 | 2.29 | 0.00 | 0.00 | 0.00 | 0.00 |
| __exp2f_finite | libm.so.6 | 0.58 | NA | 0.05 | NA | 0.05 | NA | 1 | NA | 4.98 | NA | 0.00 | NA | 0.00 | NA |
| unknown_function | binary | 0.29 | 0.12 | 0.02 | 0.01 | 0.03 | 0.01 | 1 | 1 | 6.10 | 23.63 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::mersenne_twister_engine<unsigned long, 32ul, 624ul, 397ul, 31ul, 2567483615ul, 11ul, 4294967295ul, 7ul, 2636928640ul, 15ul, 4022730752ul, 18ul, 1812433253ul>::_M_gen_rand() | binary | 0.29 | NA | 0.02 | NA | 0.03 | NA | 1 | NA | 0.15 | NA | 0.00 | NA | 0.00 | NA |
| unknown_function | [vdso] | 0.06 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 1 | 0.50 | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 |
| _int_free | libc.so.6 | 0.06 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| _int_malloc | libc.so.6 | NA | 0.06 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| el0_svc_common.constprop.0 | kernel | 0.06 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| get_random_bytes_user | kernel | 0.06 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |