Hi,
I was wondering whether there is any documentation on how to analyze the ARM pipeline. I have access to ThunderX2 nodes, and I'd like to do the kind of bottleneck analysis that can be done on Intel chips. For Skylake, I can get the formulas for computing the different metrics here: https://github.com/andikleen/pmu-tools/blob/master/skl_client_ratios.py. I checked the regular ARM docs (the ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile, and the Programmer’s Guide for ARMv8-A) and did not find any information.
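To illustrate the kind of formulas I mean, here is a minimal sketch of the level-1 top-down breakdown as I understand it for Intel Skylake. It is a simplification and not a copy of the pmu-tools code: the function name, the dictionary layout, and the sample counter values are made up for illustration, and it assumes a 4-wide pipeline with SMT off. I am looking for the ARM equivalent of this.

```python
def topdown_level1(counts, width=4):
    """Split pipeline slots into the four level-1 top-down categories.

    `counts` maps Intel Skylake event names to raw counts taken over the
    same interval. `width` is the assumed number of issue slots per cycle.
    """
    slots = width * counts["CPU_CLK_UNHALTED.THREAD"]
    retired = counts["UOPS_RETIRED.RETIRE_SLOTS"]

    frontend_bound = counts["IDQ_UOPS_NOT_DELIVERED.CORE"] / slots
    bad_speculation = (
        counts["UOPS_ISSUED.ANY"] - retired
        + width * counts["INT_MISC.RECOVERY_CYCLES"]
    ) / slots
    retiring = retired / slots
    # Backend bound is whatever is left once the other three are accounted for.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Made-up counter values, only to show how the function is called.
example = {
    "CPU_CLK_UNHALTED.THREAD": 1000,
    "IDQ_UOPS_NOT_DELIVERED.CORE": 800,
    "UOPS_ISSUED.ANY": 2200,
    "UOPS_RETIRED.RETIRE_SLOTS": 2000,
    "INT_MISC.RECOVERY_CYCLES": 50,
}
result = topdown_level1(example)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```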
Thanks,
If you share the formulas in this forum, or in an ARM document, people can cite the URL in the references section of a white paper or other publication. That way people can share knowledge, and profiling on AArch64 becomes easier. That is a win for everyone, ARM included. What is needed are formulas for the main categories and subcategories: people want to see why their code is core bound or memory bound, and whether it is L1 bound, L2 bound, external memory bound, etc.
Thanks