Intel CPU type 0xb06e0 unknown #3685

Closed
thetooth opened this issue Feb 3, 2024 · 12 comments

Comments

@thetooth

thetooth commented Feb 3, 2024

I built the latest master branch and this CPU is not listed. I think this is also related to #3338: this CPU is composed only of E-cores, so high performance counters are not working in rr, although they work elsewhere (the software being recorded under rr is a soft real-time robotics controller that relies on them). This is also on Linux with the RT patches, so perhaps there is an issue with that which I missed in the README?

[FATAL src/PerfCounters_x86.h:128:compute_cpu_microarch()] Intel CPU type 0xb06e0 unknown

lscpu:

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Atom(TM) x7425E
    CPU family:          6
    Model:               190
    Thread(s) per core:  1
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            0
    Frequency boost:     enabled
    CPU(s) scaling MHz:  105%
    CPU max MHz:         1501.0000
    CPU min MHz:         800.0000
    BogoMIPS:            2995.20
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
                         nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
                         3dnowprefetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a rdseed adx smap clflushopt clwb
                         intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear
                         serialize arch_lbr ibt flush_l1d arch_capabilities

and so on....
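
For reference, the type value in the error message can be related to the lscpu output above. The sketch below decodes it using the standard CPUID leaf 1 EAX signature layout. It assumes the value rr prints is that signature with the stepping and reserved bits masked off, which is an assumption about rr's source and not something shown in this thread. The helper name `decode_cpu_type` is hypothetical.

```python
def decode_cpu_type(cpu_type):
    """Split a CPUID.1:EAX-style signature into family, model and stepping."""
    stepping = cpu_type & 0xF
    base_model = (cpu_type >> 4) & 0xF
    family = (cpu_type >> 8) & 0xF
    ext_model = (cpu_type >> 16) & 0xF
    ext_family = (cpu_type >> 20) & 0xFF

    # For families 6 and 15 the extended model forms the high nibble of the model.
    if family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    # The extended family is only added when the base family is 15.
    if family == 0xF:
        family += ext_family
    return {"family": family, "model": model, "stepping": stepping}


info = decode_cpu_type(0xB06E0)
print(info)
```

Under that assumption, 0xb06e0 decodes to family 6 and model 0xBE (190 decimal), which agrees with the "CPU family" and "Model" lines reported by lscpu.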

After patching PerfCounters_x86.h with the new type value I get the following crash:

user@plc1:~$ /usr/local/bin/rr ./robot-ctrl
rr: Saving execution to trace directory `/home/user/.local/share/rr/robot-ctrl-3'.
[FATAL src/PerfCounters.cc:393:check_working_counters()]
Got 37 branch events, expected at least 500.

The hardware performance counter seems to not be working. Check
that hardware performance counters are working by running
  perf stat -e r5111c4 true
and checking that it reports a nonzero number of events.
If performance counters seem to be working with 'perf', file an
rr issue, otherwise check your hardware/OS/VM configuration. Also
check that other software is not using performance counters on
this CPU.
=== Start rr backtrace:
/usr/local/bin/rr(_ZN2rr13dump_rr_stackEv+0x2e)[0x555acf764e5e]
/usr/local/bin/rr(_ZN2rr15notifying_abortEv+0xe)[0x555acf764f2e]
/usr/local/bin/rr(+0x1f94ef)[0x555acf78b4ef]
/usr/local/bin/rr(_ZN2rr12PerfCounters5startEPNS_4TaskEl+0x117b)[0x555acf678dbb]
/usr/local/bin/rr(_ZN2rr4Task16resume_executionENS_13ResumeRequestENS_11WaitRequestENS_12TicksRequestEi+0x91d)[0x555acf73c40d]
/usr/local/bin/rr(_ZN2rr13RecordSession11record_stepEv+0x433)[0x555acf68fdd3]
/usr/local/bin/rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x10dd)[0x555acf6802fd]
/usr/local/bin/rr(main+0x180)[0x555acf5e0430]
/lib/x86_64-linux-gnu/libc.so.6(+0x271ca)[0x7f9379aa51ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f9379aa5285]
/usr/local/bin/rr(_start+0x21)[0x555acf5e2e41]
=== End rr backtrace
Aborted

Running the suggested test:

user@plc1:~$ perf stat -e r5111c4 true

 Performance counter stats for 'true':

            62,208      r5111c4

       0.000670960 seconds time elapsed

       0.000748000 seconds user
       0.000000000 seconds sys
@khuey
Collaborator

khuey commented Feb 3, 2024

You probably need to use 0x517ec4 (copying the perf counter from other Atom CPUs) not 0x5111c4 (from the Core CPUs).
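
To make the difference between the two values concrete, here is a minimal sketch that splits a raw event code into fields, assuming the Intel IA32_PERFEVTSELx bit layout (event select in bits 0-7, unit mask in bits 8-15, then control flags). The function name `decode_raw_event` is hypothetical, and whether the kernel honours the upper flag bits of a raw event is not covered here.

```python
def decode_raw_event(config):
    """Break a raw x86 event code into IA32_PERFEVTSELx-style fields."""
    return {
        "event": config & 0xFF,           # event select
        "umask": (config >> 8) & 0xFF,    # unit mask
        "usr": bool(config & (1 << 16)),  # count in user mode
        "os": bool(config & (1 << 17)),   # count in kernel mode
        "int": bool(config & (1 << 20)),  # interrupt on overflow
        "en": bool(config & (1 << 22)),   # counter enable
    }


for raw in (0x5111C4, 0x517EC4):
    fields = decode_raw_event(raw)
    print(f"{raw:#x}: event={fields['event']:#x} umask={fields['umask']:#x} "
          f"usr={fields['usr']} os={fields['os']}")
```

Both codes select event 0xc4 with the same flags; they differ only in the unit mask (0x11 versus 0x7e).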

@thetooth
Author

thetooth commented Feb 4, 2024

I tried adding an entry to the PMU config. The application now starts under rr, but it crashes after a few ms.

Attaching to the debugger doesn't seem to reveal anything useful...

@rocallahan
Collaborator

Can you dump perf list --details here?

@thetooth
Author

thetooth commented Feb 5, 2024

@rocallahan
Collaborator

Something is quite wrong with your system here. The only hardware events listed are related to "uncore", none of the normal CPU events are available. Is this in a VM guest?

@thetooth
Author

thetooth commented Feb 5, 2024

No, it's one of these: Atom x7425E CPU

@rocallahan
Collaborator

It looks like Linux doesn't support the PMU hardware on this chip, assuming there is one.
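
One way to cross-check which PMUs the kernel has registered, independent of perf list, is to look at sysfs. This is a minimal sketch, assuming the usual `/sys/bus/event_source/devices` layout and that the core PMU appears as `cpu` (or `cpu_core`/`cpu_atom` on hybrid parts). The function names are hypothetical, and nothing here was run on the machine in this thread.

```python
from pathlib import Path


def list_pmus(root="/sys/bus/event_source/devices"):
    """Return the names of the PMUs registered under the given sysfs directory."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(entry.name for entry in base.iterdir())


def has_core_pmu(pmus):
    """True if a core CPU PMU (as opposed to only uncore/software PMUs) is listed."""
    # Assumed names: "cpu" on non-hybrid x86, "cpu_core"/"cpu_atom" on hybrid parts.
    return any(name in ("cpu", "cpu_core", "cpu_atom") for name in pmus)


if __name__ == "__main__":
    pmus = list_pmus()
    print(pmus)
    print("core PMU registered:", has_core_pmu(pmus))
```

If only `uncore_*` and software entries show up, that would be consistent with the perf list output described above.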

@thetooth
Author

thetooth commented Feb 5, 2024

Is there anything that can be done to get around this? I'm trying to figure out why a crash happens after 6-7 hours in my application, and both gdb and lldb seem to silently detach from the process after a couple of hours, so I never see the cause.

@rocallahan
Collaborator

It's not an rr issue. You need someone to do a deep dive into what it takes to get hardware PMU events working on your system. Try whoever you normally rely on for Linux support.

@khuey
Collaborator

khuey commented Feb 5, 2024

Can you pastebin the dmesg output after booting this machine?

@thetooth
Author

thetooth commented Feb 5, 2024

@khuey here you go

@rocallahan Uhh that would be me I think, this is a single developer hobby project after all 👀

@khuey
Collaborator

khuey commented Feb 6, 2024

[    0.048998] Performance Events: XSAVE Architectural LBR, PEBS fmt4+-baseline, PEBS-via-PT,  AnyThread deprecated, Gracemont events, 32-deep LBR, full-width counters, Intel PMU driver.
[    0.048998] ... version:                5
[    0.048998] ... bit width:              48
[    0.048998] ... generic registers:      6
[    0.048998] ... value mask:             0000ffffffffffff
[    0.048998] ... max period:             00007fffffffffff
[    0.048998] ... fixed-purpose events:   3
[    0.048998] ... event mask:             000000070000003f

It doesn't seem like perf should be totally busted.
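
As a sanity check on the dmesg lines above, the printed masks follow from the reported bit width and counter counts. The sketch below recomputes them, assuming generic counters occupy the low bits of the event mask and fixed-purpose counters start at bit 32; the variable names are illustrative only.

```python
# Values copied from the dmesg output above.
bit_width = 48
generic_registers = 6
fixed_purpose_events = 3

# A 48-bit counter can hold values up to 2**48 - 1.
value_mask = (1 << bit_width) - 1
# The maximum sampling period is one bit narrower than the counter.
max_period = (1 << (bit_width - 1)) - 1
# Generic counters fill bits 0..5; fixed counters are assumed to start at bit 32.
event_mask = ((1 << generic_registers) - 1) | (((1 << fixed_purpose_events) - 1) << 32)

print(f"value mask: {value_mask:016x}")
print(f"max period: {max_period:016x}")
print(f"event mask: {event_mask:016x}")
```

The three results match the "value mask", "max period" and "event mask" lines in the log, so the driver is reporting a self-consistent set of six generic and three fixed counters.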
