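| Name | Module | Coverage (%): orig_default | Coverage (%): gcc_default | Coverage (%): gcc_1 | Inclusive Time w.r.t. Wall Time (s): orig_default | Inclusive Time w.r.t. Wall Time (s): gcc_default | Inclusive Time w.r.t. Wall Time (s): gcc_1 | Max Inc. Time over Threads (s): orig_default | Max Inc. Time over Threads (s): gcc_default | Max Inc. Time over Threads (s): gcc_1 | Nb Threads: orig_default | Nb Threads: gcc_default | Nb Threads: gcc_1 | Deviation (coverage): orig_default | Deviation (coverage): gcc_default | Deviation (coverage): gcc_1 | Deviation (time): orig_default | Deviation (time): gcc_default | Deviation (time): gcc_1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |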
| sgemm_sve_big | libarmpl_lp64_mp.so | 25.06 | 83.07 | NA | 9.31 | 9.33 | NA | 9.54 | 9.48 | NA | 64 | 64 | NA | 7.30 | 0.79 | NA | 0.31 | 0.32 | NA |
| sgemm_sve_big | libarmpl_lp64.so | NA | NA | 88.63 | NA | NA | 317.85 | NA | NA | 344.30 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | 60.52 | NA | NA | 22.49 | NA | NA | 23.40 | NA | NA | 64 | NA | NA | 7.61 | NA | NA | 2.81 | NA | NA |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<16l, 16l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1... | libarmpl_lp64_mp.so | 2.03 | 6.73 | NA | 0.75 | 0.76 | NA | 0.90 | 0.87 | NA | 64 | 64 | NA | 0.64 | 0.64 | NA | 0.06 | 0.06 | NA |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | 7.20 | NA | NA | 2.67 | NA | NA | 3.00 | NA | NA | 64 | NA | NA | 0.98 | NA | NA | 0.36 | NA | NA |
| omp_get_num_procs | libgomp.so.1.0.0 | NA | 2.77 | 2.94 | NA | 0.31 | 10.53 | NA | 0.38 | 0.24 | NA | 64 | 64 | NA | 0.44 | 6.48 | NA | 0.05 | 0.03 |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<12l, 12l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1... | libarmpl_lp64_mp.so | 1.37 | 4.12 | NA | 0.51 | 0.46 | NA | 0.63 | 0.58 | NA | 64 | 64 | NA | 0.38 | 0.55 | NA | 0.07 | 0.07 | NA |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<16l, 16l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1... | libarmpl_lp64.so | NA | NA | 3.07 | NA | NA | 11.01 | NA | NA | 11.93 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __GI___sched_yield | libc.so.6 | 2.12 | NA | NA | 0.79 | NA | NA | 0.91 | NA | NA | 63 | NA | NA | 0.16 | NA | NA | 0.06 | NA | NA |
| ggml_vec_dot_q8_0_q8_0 | libggml-cpu.so | 0.21 | 0.69 | 1.21 | 0.08 | 0.08 | 4.35 | 0.09 | 0.09 | 0.09 | 64 | 64 | 64 | 0.07 | 0.07 | 2.46 | 0.01 | 0.01 | 0.01 |
| ggml_compute_forward_flash_attn_ext | libggml-cpu.so | 0.14 | 0.49 | 0.91 | 0.05 | 0.06 | 3.27 | 0.08 | 0.09 | 0.09 | 63 | 63 | 63 | 0.04 | 0.11 | 2.80 | 0.01 | 0.01 | 0.01 |
| ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | libggml-cpu.so | 0.11 | 0.42 | 0.58 | 0.04 | 0.05 | 2.09 | 0.08 | 0.09 | 0.05 | 63 | 63 | 63 | 0.04 | 0.14 | 2.40 | 0.01 | 0.01 | 0.01 |
| ggml_vec_swiglu_f32 | libggml-cpu.so | 0.08 | 0.33 | 0.43 | 0.03 | 0.04 | 1.55 | 0.07 | 0.06 | 0.06 | 63 | 63 | 63 | 0.04 | 0.11 | 1.95 | 0.01 | 0.01 | 0.01 |
| unknown_function | [vdso] | 0.48 | NA | NA | 0.18 | NA | NA | 0.27 | NA | NA | 63 | NA | NA | 0.08 | NA | NA | 0.03 | NA | NA |
| sincosf | libm.so.6 | 0.04 | 0.15 | 0.25 | 0.02 | 0.02 | 0.89 | 0.04 | 0.06 | 0.04 | 63 | 58 | 62 | 0.02 | 0.09 | 1.77 | 0.01 | 0.01 | 0.01 |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<12l, 12l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1... | libarmpl_lp64.so | NA | NA | 0.43 | NA | NA | 1.55 | NA | NA | 1.68 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| rope_yarn(float, float, float*, long, float, float, float*, float*) | libggml-cpu.so | NA | NA | 0.38 | NA | NA | 1.35 | NA | NA | 0.05 | NA | NA | 63 | NA | NA | 2.04 | NA | NA | 0.01 |
| __expf_finite | libm.so.6 | 0.03 | 0.12 | 0.18 | 0.01 | 0.01 | 0.63 | 0.03 | 0.04 | 0.03 | 59 | 61 | 56 | 0.02 | 0.07 | 1.19 | 0.01 | 0.01 | 0.01 |
| ggml_compute_forward_rms_norm | libggml-cpu.so | 0.04 | 0.11 | 0.16 | 0.01 | 0.01 | 0.58 | 0.04 | 0.03 | 0.02 | 59 | 57 | 60 | 0.02 | 0.05 | 0.98 | 0.01 | 0.01 | 0.00 |
| ggml_compute_forward_add_non_quantized | libggml-cpu.so | 0.03 | 0.10 | 0.15 | 0.01 | 0.01 | 0.53 | 0.04 | 0.03 | 0.03 | 57 | 59 | 57 | 0.02 | 0.05 | 1.14 | 0.01 | 0.01 | 0.01 |
| ggml_compute_forward_mul | libggml-cpu.so | 0.03 | 0.09 | 0.17 | 0.01 | 0.01 | 0.60 | 0.04 | 0.03 | 0.04 | 53 | 49 | 55 | 0.02 | 0.07 | 1.54 | 0.01 | 0.01 | 0.01 |
| ggml_vec_dot_f16 | libggml-cpu.so | 0.02 | 0.09 | 0.16 | 0.01 | 0.01 | 0.57 | 0.02 | 0.03 | 0.03 | 52 | 58 | 51 | 0.01 | 0.05 | 1.45 | 0.00 | 0.01 | 0.01 |
| auto armpl::clag::parallelise_2d<true, true, armpl::clag::resident<(armpl::clag::which_matrix)1, armpl::clag::bcms<(armpl::clag::which_matrix)1, armpl::clag::(anonymous namespace)::buffer_pool<armpl::clag::bcms_thread_record<float> >, ... | libarmpl_lp64_mp.so | 0.04 | 0.23 | NA | 0.01 | 0.03 | NA | 0.04 | 0.06 | NA | 59 | 63 | NA | 0.02 | 0.10 | NA | 0.01 | 0.01 | NA |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<8l, 12l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1l... | libarmpl_lp64_mp.so | 0.05 | 0.18 | NA | 0.02 | 0.02 | NA | 0.07 | 0.11 | NA | 32 | 32 | NA | 0.04 | 0.20 | NA | 0.02 | 0.02 | NA |
| unknown_function | libomp.so | 0.17 | NA | NA | 0.06 | NA | NA | 0.10 | NA | NA | 63 | NA | NA | 0.05 | NA | NA | 0.02 | NA | NA |
| ggml_cpu_fp32_to_fp16 | libggml-cpu.so | 0.02 | 0.06 | 0.07 | 0.01 | 0.01 | 0.26 | 0.03 | 0.02 | 0.01 | 50 | 46 | 40 | 0.01 | 0.05 | 0.70 | 0.01 | 0.01 | 0.00 |
| __kmp_yield | libomp.so | 0.06 | NA | NA | 0.02 | NA | NA | 0.05 | NA | NA | 61 | NA | NA | 0.03 | NA | NA | 0.01 | NA | NA |
| auto armpl::clag::parallelise_2d<true, false, armpl::clag::resident<(armpl::clag::which_matrix)1, armpl::clag::pack<(armpl::clag::which_matrix)1, armpl::clag::(anonymous namespace)::buffer_pool<float>, armpl::clag::spec::convert<armpl::cl... | libarmpl_lp64_mp.so | 0.02 | 0.04 | NA | 0.01 | 0.00 | NA | 0.03 | 0.02 | NA | 46 | 37 | NA | 0.02 | 0.04 | NA | 0.01 | 0.00 | NA |
| __GI___memcpy_sve | libc.so.6 | 0.01 | 0.02 | 0.03 | 0.00 | 0.00 | 0.12 | 0.01 | 0.01 | 0.01 | 22 | 20 | 21 | 0.03 | 0.02 | 0.45 | 0.00 | 0.00 | 0.00 |
| unknown_function | libggml-cpu.so | 0.01 | 0.01 | 0.04 | 0.00 | 0.00 | 0.13 | 0.01 | 0.02 | 0.01 | 22 | 19 | 24 | 0.00 | 0.02 | 0.54 | 0.00 | 0.00 | 0.00 |
| __kmp_now_nsec | libomp.so | 0.06 | NA | NA | 0.02 | NA | NA | 0.04 | NA | NA | 62 | NA | NA | 0.03 | NA | NA | 0.01 | NA | NA |
| dequantize_row_q8_0 | libggml-base.so | 0.00 | 0.01 | 0.02 | 0.00 | 0.00 | 0.06 | 0.03 | 0.04 | 0.05 | 4 | 1 | 3 | 0.15 | 0.00 | 0.61 | 0.01 | 0.00 | 0.03 |
| void armpl::clag::n_cpp_interleave<16ul, 0l, float, float, armpl::clag::spec::sve_architecture_spec>(unsigned long, unsigned long, float const*, unsigned long, unsigned long, unsigned long, unsigned long, float*, unsigned long, long, long) | libarmpl_lp64_mp.so | 0.01 | 0.01 | NA | 0.00 | 0.00 | NA | 0.02 | 0.01 | NA | 28 | 17 | NA | 0.01 | 0.02 | NA | 0.00 | 0.00 | NA |
| ggml_compute_forward_mul_mat | libggml-cpu.so | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.04 | 0.01 | 0.01 | 0.01 | 7 | 10 | 7 | 0.00 | 0.02 | 0.39 | 0.00 | 0.00 | 0.00 |
| memset | ld-linux-aarch64.so.1 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.04 | 0.06 | 0.04 | 0.04 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_graph_compute_thread | libggml-cpu.so | NA | 0.01 | 0.01 | NA | 0.00 | 0.03 | NA | 0.01 | 0.00 | NA | 12 | 7 | NA | 0.01 | 0.04 | NA | 0.00 | 0.00 |
| void armpl::clag::n_cpp_interleave<12ul, 0l, float, float, armpl::clag::spec::sve_architecture_spec>(unsigned long, unsigned long, float const*, unsigned long, unsigned long, unsigned long, unsigned long, float*, unsigned long, long, long) | libarmpl_lp64_mp.so | 0.00 | 0.01 | NA | 0.00 | 0.00 | NA | 0.01 | 0.01 | NA | 17 | 16 | NA | 0.00 | 0.02 | NA | 0.00 | 0.00 | NA |
| __GI___memset_generic | libc.so.6 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.03 | 0.03 | 0.04 | 0.03 | 1 | 2 | 1 | 0.00 | 0.26 | 0.00 | 0.00 | 0.02 | 0.00 |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.03 | 0.03 | 0.04 | 0.04 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __GI___pthread_mutex_lock | libc.so.6 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.02 | 0.01 | 0.01 | 0.00 | 6 | 8 | 4 | 0.00 | 0.02 | 0.51 | 0.00 | 0.00 | 0.00 |
| __GI___libc_malloc | libc.so.6 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.02 | 0.01 | 0.04 | 0.02 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| quantize_row_q8_0 | libggml-cpu.so | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 1 | 2 | 7 | 0.00 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const | libstdc++.so.6.0.33 | NA | 0.00 | 0.01 | NA | 0.00 | 0.03 | NA | 0.02 | 0.03 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| llama_vocab::~llama_vocab() | libllama.so | NA | 0.00 | 0.01 | NA | 0.00 | 0.02 | NA | 0.02 | 0.02 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| void armpl::clag::n_cpp_interleave<16ul, 0l, float, float, armpl::clag::spec::sve_architecture_spec>(unsigned long, unsigned long, float const*, unsigned long, unsigned long, unsigned long, unsigned long, float*, unsigned long, long, long) | libarmpl_lp64.so | NA | NA | 0.01 | NA | NA | 0.03 | NA | NA | 0.04 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) | libstdc++.so.6.0.33 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.02 | 0.01 | 0.02 | 0.02 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| llama_vocab::impl::load(llama_model_loader&, LLM_KV const&) | libllama.so | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.02 | 0.04 | 0.01 | 0.02 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| _IO_fread | libc.so.6 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.02 | 0.01 | 0.01 | 0.02 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| GOMP_single_start | libgomp.so.1.0.0 | NA | 0.01 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 10 | NA | NA | 0.01 | NA | NA | 0.00 | NA |
| auto armpl::clag::execute_strategy<16ul, std::tuple<armpl::clag::matmul::set_or_scale, armpl::clag::matmul::compressed_general_matrix_vector, armpl::clag::matmul::symmetric_matrix_vector, armpl::clag::matmul::compressed_symmetric_matrix_vector, armpl... | libarmpl_lp64.so | NA | NA | 0.01 | NA | NA | 0.03 | NA | NA | 0.03 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| unknown_function | libllama.so | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.02 | 0.02 | 0.01 | 0.02 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| _int_free | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.03 | 0.01 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, unsigned char>, s... | libllama.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __aarch64_ldadd8_acq_rel | libarmpl_lp64_mp.so | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.00 | NA | 7 | 7 | NA | 0.01 | 0.00 | NA | 0.00 | 0.00 | NA |
| _int_malloc | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_cpu_get_sve_cnt | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 1 | 3 | 3 | 0.00 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 |
| ggml_is_empty | libggml-base.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 1 | 3 | 2 | 0.00 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 |
| __GI___strlen_asimd | libc.so.6 | NA | 0.00 | 0.00 | NA | 0.00 | 0.01 | NA | 0.00 | 0.01 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| ggml_compute_forward_set_rows | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 2 | 2 | 2 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 |
| ggml_cpu_extra_compute_forward | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 4 | 3 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| omp_fulfill_event | libgomp.so.1.0.0 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 5 | NA | NA | 0.02 | NA | NA | 0.00 | NA |
| __GI_memcmp | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __GI___pthread_create | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_compute_forward_glu | libggml-cpu.so | NA | 0.00 | 0.00 | NA | 0.00 | 0.01 | NA | 0.00 | 0.00 | NA | 2 | 2 | NA | 0.00 | 0.02 | NA | 0.00 | 0.00 |
| _dl_lookup_symbol_x | ld-linux-aarch64.so.1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 2 | 1 | 1 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| malloc_consolidate | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| syscall | libc.so.6 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 5 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| @plt_start@ | libstdc++.so.6.0.33 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| std::__detail::_Map_base<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, st... | libllama.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_graph_compute_thread | libggml-cpu.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 13 | NA | NA | 0.01 | NA | NA | 0.00 | NA | NA |
| unlink_chunk.constprop.0 | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| _dl_fixup | ld-linux-aarch64.so.1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 2 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) | libstdc++.so.6.0.33 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| __GI___pthread_getattr_default_np | libc.so.6 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __log2_finite | libm.so.6 | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.01 | NA | 2 | 3 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA |
| ggml_compute_forward_view | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| @plt_start@ | libarmpl_lp64_mp.so | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.00 | NA | 8 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA |
| __kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 10 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml::cpu::kleidiai::extra_buffer_type::get_tensor_traits(ggml_tensor const*) | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| unicode_cpt_from_utf8(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long&) | libllama.so | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| __calloc | libc.so.6 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| llama_vocab::impl::token_to_piece(int, char*, int, int, bool) const | libllama.so | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| ggml_type_size | libggml-base.so | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| __kmp_barrier | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 8 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __GI___mprotect | libc.so.6 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.01 | NA | 0.00 | 1 | NA | 1 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 |
| __aarch64_ldadd8_acq_rel | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 8 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| replace_all(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<... | libllama.so | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 1 | NA | 1 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 |
| gguf_kv_to_str[abi:cxx11](gguf_context const*, int) | libllama.so | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 1 | NA | 1 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 |
| ggml_can_repeat | libggml-base.so | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.01 | NA | 0.00 | 1 | NA | 1 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 |
| __GI___lll_lock_wait | libc.so.6 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 1 | NA | 1 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::done_check() | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.02 | NA | NA | 5 | NA | NA | 0.01 | NA | NA | 0.00 | NA | NA |
| __GI___mmap | libc.so.6 | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.01 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA |
| __vfscanf_internal | libc.so.6 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| ggml_backend_sched_backend_id_from_cur(ggml_backend_sched*, ggml_tensor*) | libggml-base.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __GI__dl_allocate_tls_init | ld-linux-aarch64.so.1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| _IO_file_xsgetn | libc.so.6 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| llama_model::load_tensors(llama_model_loader&)::{lambda(LLM_TN_IMPL const&, std::initializer_list<long> const&, int)#4}::operator()(LLM_TN_IMPL const&, std::initializer_list<long> const&, int) const [clone .constprop.0] | libllama.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| llama_kv_cache::prepare(std::vector<llama_ubatch, std::allocator<llama_ubatch> > const&) | libllama.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| void armpl::clag::(anonymous namespace)::n_interleave_cntg_loop<8l, 12l, 0l, armpl::clag::(anonymous namespace)::step_val_fixed<1l>, unsigned long, float, float>(long, long, float const*, armpl::clag::(anonymous namespace)::step_val_fixed<1l... | libarmpl_lp64.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __GI___clone | libc.so.6 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __aarch64_ldadd8_acq_rel | libarmpl_lp64.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| ggml_graph_compute._omp_fn.0 | libggml-cpu.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| ggml_nbytes | libggml-base.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 |
| __kmp_invoke_task_func | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 6 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_graph_compute.omp_outlined | libggml-cpu.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 5 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __logf_finite | libm.so.6 | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.01 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA |
| memcpy | ld-linux-aarch64.so.1 | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA | 1 | 1 | NA | 0.00 | 0.00 | NA | 0.00 | 0.00 | NA |
| __kmp_hyper_barrier_gather(barrier_type, kmp_info*, int, int, void (*)(void*, void*), void*) | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 4 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| void armpl::clag::parallel<armpl::clag::parallelise_2d<true, false, armpl::clag::resident<(armpl::clag::which_matrix)1, armpl::clag::pack<(armpl::clag::which_matrix)1, armpl::clag::(anonymous namespace)::buffer_pool<float>, armpl::clag::s... | libarmpl_lp64_mp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 3 | NA | NA | 0.01 | NA | NA | 0.00 | NA | NA |
| __kmp_fork_barrier(int, int) | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 4 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_kleidiai_select_kernels(cpu_feature, ggml_tensor const*) | libggml-cpu.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| ggml_is_numa | libggml-cpu.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| gguf_get_arr_str | libggml-base.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| gguf_data_to_str(gguf_type, void const*, int) | libllama.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| __GI_memcpy | libc.so.6 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| __powf_finite | libm.so.6 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| std::_Hash_bytes(void const*, unsigned long, unsigned long) | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| ggml::cpu::repack::extra_buffer_type::get_tensor_traits(ggml_tensor const*) | libggml-cpu.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| omp_get_thread_num | libgomp.so.1.0.0 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .constprop.0] | libllama.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| std::pair<std::_Rb_tree_iterator<int>, bool> std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_unique<int const&>(int const&) | libllama.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| __GI_memchr | libc.so.6 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .constprop.0] | libggml-base.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| unicode_utf8_to_byte(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) | libllama.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| _armpl_ZNSt7__cxx1110moneypunctIcLb0EE24_M_initialize_moneypunctEP15__locale_structPKc | libarmpl_lp64_mp.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| void armpl::clag::parallel<armpl::clag::parallelise_2d<true, true, armpl::clag::resident<(armpl::clag::which_matrix)1, armpl::clag::bcms<(armpl::clag::which_matrix)1, armpl::clag::(anonymous namespace)::buffer_pool<armpl::clag::bcms_thread_r... | libarmpl_lp64_mp.so | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::find(char const*, unsigned long, unsigned long) const | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| omp_get_num_threads | libgomp.so.1.0.0 | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
| __kmp_join_barrier(int) | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 3 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_get_global_thread_id_reg | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 3 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_finish_implicit_task | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 3 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_api_omp_get_thread_num | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 3 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_GOMP_microtask_wrapper(int*, int*, void (*)(void*), void*) | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 3 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_invoke_microtask | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 2 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_task_team_sync | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.01 | NA | NA | 2 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __aarch64_cas4_acq | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| unicode_cpt_to_utf8[abi:cxx11](unsigned int) | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::ostream::put(char) | libstdc++.so.6.0.33 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| llama_batch_allocr::init(llama_batch const&, llama_vocab const&, llama_memory_i const*, unsigned int, unsigned int, bool) | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::pair<std::__detail::_Node_iterator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<c... | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_compute_forward_reshape | libggml-cpu.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_compute_forward_rope | libggml-cpu.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_enter_single | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| operator new(unsigned long) | libstdc++.so.6.0.33 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __aarch64_ldadd4_acq_rel | libggml-blas.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_nrows | libggml-base.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| ggml_is_contiguous_1 | libggml-base.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| allocate_dtv | ld-linux-aarch64.so.1 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_init_implicit_task | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) | libstdc++.so.6.0.33 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| _ZNSt4pairINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_EC2IRS5_S8_TnNSt9enable_ifIXaaclsr5_PCCPE22_MoveConstructiblePairIT_T0_EEclsr5_PCCPE30_ImplicitlyMoveConvertiblePairISA_SB_EEEbE4typeELb1EEEOSA_OSB_ | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::alloca... | libllama.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| std::__future_base::_Result_base::_Result_base() | libstdc++.so.6.0.33 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __strchrnul | libc.so.6 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| create_thread | libc.so.6 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __GI___pthread_mutex_unlock_usercnt | libc.so.6 | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
| __kmp_api_omp_get_num_threads | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |