Add cpu_type for Intel Alder Lake. #2997

Merged 1 commit on Dec 23, 2021

gcp
Contributor

@gcp gcp commented Nov 30, 2021

This adds the cpu_type definition, which is enough to make most 64-bit tests pass. All 32-bit tests fail with:

1358: rr: ../src/ExtraRegisters.cc:665: void rr::ExtraRegisters::reset(): Assertion `d.xsave_feature_bit == PKRU_FEATURE_BIT' failed.

Still investigating that.

@gcp
Contributor Author

gcp commented Nov 30, 2021

In case it's relevant, I noticed "XSAVE Architectural LBR" in the kernel log:

[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: xstate_offset[9]:  832, xstate_sizes[9]:    8
[    0.000000] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
...
[    0.134247] Performance Events: XSAVE Architectural LBR, PEBS fmt4+-baseline,  AnyThread deprecated, Alderlake Hybrid events, 32-deep LBR, full-width counters, Intel PMU driver.
[    0.134338] core: cpu_core PMU driver: 
[    0.134339] ... version:                5
[    0.134340] ... bit width:              48
[    0.134340] ... generic registers:      8
[    0.134340] ... value mask:             0000ffffffffffff
[    0.134341] ... max period:             00007fffffffffff
[    0.134342] ... fixed-purpose events:   4
[    0.134342] ... event mask:             0001000f000000ff
[    0.134362] rcu: Hierarchical SRCU implementation.
[    0.134495] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[    0.134495] smp: Bringing up secondary CPUs ...
[    0.134495] x86: Booting SMP configuration:
[    0.134495] .... node  #0, CPUs:        #1  #2  #3  #4  #5  #6  #7  #8  #9 #10 #11 #12 #13 #14 #15 #16
[    0.004527] core: cpu_atom PMU driver: PEBS-via-PT 
[    0.004527] ... version:                5
[    0.004527] ... bit width:              48
[    0.004527] ... generic registers:      6
[    0.004527] ... value mask:             0000ffffffffffff
[    0.004527] ... max period:             00007fffffffffff
[    0.004527] ... fixed-purpose events:   3
[    0.004527] ... event mask:             000000070000003f
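
As a reading aid for the x86/fpu lines in the log above, here is a minimal sketch in C that decodes the enabled feature mask 0x207 and recomputes the 840-byte context size. The offsets and sizes are copied from the log; the 512-byte legacy area and 64-byte XSAVE header come from the Intel SDM. The struct and function names are hypothetical and are not part of rr or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One XSAVE state component as reported by the kernel log above. */
struct xstate_component {
  int bit;         /* XSAVE feature bit number */
  unsigned offset; /* xstate_offset[bit] from the log */
  unsigned size;   /* xstate_sizes[bit] from the log */
};

/* Values copied from the log: bit 2 = AVX, bit 9 = PKRU. */
static const struct xstate_component components[] = {
  { 2, 576, 256 },
  { 9, 832, 8 },
};

/* Returns 1 if feature bit `bit` is set in the enabled-features mask. */
int xfeature_enabled(uint64_t mask, int bit) {
  return (int)((mask >> bit) & 1);
}

/* Context size = end of the last component, and never less than the
   512-byte legacy area (x87 + SSE) plus the 64-byte XSAVE header. */
unsigned xsave_context_size(void) {
  unsigned end = 512 + 64;
  for (size_t i = 0; i < sizeof(components) / sizeof(components[0]); i++) {
    unsigned component_end = components[i].offset + components[i].size;
    if (component_end > end) {
      end = component_end;
    }
  }
  return end;
}

/* Cross-check against the numbers printed in the log. */
void check_against_log(void) {
  assert(xfeature_enabled(0x207, 9)); /* 0x200: Protection Keys User registers */
  assert(xsave_context_size() == 840);
}
```

Under these assumptions, the mask 0x207 has bits 0, 1, 2 and 9 set (x87, SSE, AVX, PKRU), and PKRU occupies the final 8 bytes of the 840-byte area.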

@gcp
Contributor Author

gcp commented Nov 30, 2021

This looks a bit suspicious because it only seems to set the PKRU flag in 64-bit mode, but my guess at adding || regno == DREG_PKRU earlier clearly isn't enough:

if (regno == DREG_64_PKRU) {

@gcp
Contributor Author

gcp commented Nov 30, 2021

Disabling the PKRU-saving code that contains that assertion, and then running with taskset -c 0-15 (P-cores only), yields:

99% tests passed, 28 tests failed out of 2713

Total Test time (real) = 590.11 sec

The following tests FAILED:
        438 - x86/pkeys (Failed)
        439 - x86/pkeys-no-syscallbuf (Failed)
        1021 - perf_event_mmap-no-syscallbuf (Failed)
        1240 - nested_detach (Failed)
        1243 - nested_detach_kill-no-syscallbuf (Failed)
        1794 - x86/pkeys-32 (Failed)
        1795 - x86/pkeys-32-no-syscallbuf (Failed)
        2192 - alternate_thread_diversion-32 (Failed)
        2234 - call_function-32 (Failed)
        2235 - call_function-32-no-syscallbuf (Failed)
        2236 - call_gettid-32 (Failed)
        2237 - call_gettid-32-no-syscallbuf (Failed)
        2250 - conditional_breakpoint_calls-32 (Failed)
        2251 - conditional_breakpoint_calls-32-no-syscallbuf (Failed)
        2262 - crash_in_function-32 (Failed)
        2263 - crash_in_function-32-no-syscallbuf (Failed)
        2270 - diversion_sigtrap-32 (Failed)
        2271 - diversion_sigtrap-32-no-syscallbuf (Failed)
        2273 - diversion_syscall-32-no-syscallbuf (Failed)
        2377 - perf_event_mmap-32-no-syscallbuf (Failed)
        2547 - call_exit-32-no-syscallbuf (Failed)
        2615 - remove_watchpoint-32-no-syscallbuf (Failed)
        2624 - restart_diversion-32 (Failed)
        2625 - restart_diversion-32-no-syscallbuf (Failed)
        2650 - run_in_function-32 (Failed)
        2651 - run_in_function-32-no-syscallbuf (Failed)
        2700 - unwind_on_signal-32 (Failed)
        2701 - unwind_on_signal-32-no-syscallbuf (Failed)

Without taskset there are many more failures, and some tests hang indefinitely.

@rocallahan
Collaborator

> This looks a bit suspicious because it only seems to set the PKRU flag in 64-bit mode

We should use something like this, right?

@rocallahan
Collaborator

Probably should factor out the code to get the right GdbRegister value for PKRU for the arch.

@gcp
Contributor Author

gcp commented Nov 30, 2021

> > This looks a bit suspicious because it only seems to set the PKRU flag in 64-bit mode
>
> We should use something like this, right?

Actually, the code converts it here

regno = DREG_64_PKRU;

So in the line you indicated, it probably shouldn't do that ternary?

@gcp
Contributor Author

gcp commented Nov 30, 2021

With that fix:

99% tests passed, 13 tests failed out of 2713

Total Test time (real) = 311.32 sec

The following tests FAILED:
        438 - x86/pkeys (Failed)
        439 - x86/pkeys-no-syscallbuf (Failed)
        1012 - nested_detach_wait (Failed)
        1013 - nested_detach_wait-no-syscallbuf (Failed)
        1021 - perf_event_mmap-no-syscallbuf (Failed)
        1242 - nested_detach_kill (Failed)
        1794 - x86/pkeys-32 (Failed)
        1795 - x86/pkeys-32-no-syscallbuf (Failed)
        2262 - crash_in_function-32 (Failed)
        2263 - crash_in_function-32-no-syscallbuf (Failed)
        2368 - nested_detach_wait-32 (Failed)
        2377 - perf_event_mmap-32-no-syscallbuf (Failed)
        2599 - nested_detach_kill-32-no-syscallbuf (Failed)

@gcp
Contributor Author

gcp commented Nov 30, 2021

Running some of the failing tests as root makes them pass.

My system now has a bunch of stale /usr/bin/bash source_dir/src/test/dconf_mock.run dconf_mock_32 -n bin_dir 120 processes though, so maybe I want to reboot.

@gcp
Contributor Author

gcp commented Nov 30, 2021

Second attempt:

Looks like some of the failures at least are intermittent.

99% tests passed, 15 tests failed out of 2713

Total Test time (real) = 2422.40 sec

The following tests FAILED:
        187 - futex_exit_race_sigsegv-no-syscallbuf (Failed)
        438 - x86/pkeys (Failed)
        439 - x86/pkeys-no-syscallbuf (Failed)
        446 - prctl_caps (Failed)
        447 - prctl_caps-no-syscallbuf (Failed)
        1021 - perf_event_mmap-no-syscallbuf (Failed)
        1543 - futex_exit_race_sigsegv-32-no-syscallbuf (Failed)
        1791 - pid_ns_shutdown-32-no-syscallbuf (Failed)
        1794 - x86/pkeys-32 (Failed)
        1795 - x86/pkeys-32-no-syscallbuf (Failed)
        1802 - prctl_caps-32 (Failed)
        1803 - prctl_caps-32-no-syscallbuf (Failed)
        2262 - crash_in_function-32 (Failed)
        2263 - crash_in_function-32-no-syscallbuf (Failed)
        2377 - perf_event_mmap-32-no-syscallbuf (Failed)

@khuey
Collaborator

khuey commented Nov 30, 2021

perf_event_mmap*-no-syscallbuf is a known issue.

The most interesting ones are probably x86/pkeys and prctl_caps.

Based on those failures, though, I would expect that rr largely works on your system at this point and there are just a few minor issues to deal with.

@gcp
Contributor Author

gcp commented Nov 30, 2021

> The most interesting ones are probably x86/pkeys and prctl_caps.

prctl_caps passed the first time, so this looks intermittent.

@khuey
Collaborator

khuey commented Nov 30, 2021

Does objdir/bin/pkeys run successfully (print "EXIT-SUCCESS") on your system outside of rr? What happens if you rr record it? And then rr replay it?

@gcp
Contributor Author

gcp commented Nov 30, 2021

prctl_caps fails as root only, works as a user.

@gcp
Contributor Author

gcp commented Nov 30, 2021

crash_in_function-32 fails due to:

2263: 
2263: Program received signal SIGSEGV, Segmentation fault.
2263: c
2263: crash_null_deref () at 32/util.h:305
2263: 305       inline static void crash_null_deref(void) { *(volatile int*)NULL = 0; }
2263: The program being debugged was signaled while in a function called from GDB.
2263: GDB remains in the frame where the signal was received.
2263: To change this behavior use "set unwindonsignal on".
2263: Evaluation of the expression containing the function
2263: (crash) will be abandoned.
2263: When the function is done executing, GDB will silently stop.
2263: (rr) c
Continuing.
2263: EXIT-SUCCESS
2263: [FATAL ../src/ReplaySession.cc:509:cont_syscall_boundary()] 
2263:  (task 426364 (rec:426359) at time 173)
2263:  -> Assertion `false' failed to hold. Replay got unrecorded signal {signo:SIGSEGV,errno:SUCCESS,code:SEGV_MAPERR,addr:0xcc5bf31f}

@gcp
Contributor Author

gcp commented Nov 30, 2021

> Does objdir/bin/pkeys run successfully (print "EXIT-SUCCESS") on your system outside of rr? What happens if you rr record it? And then rr replay it?

morbo@alderla:~/git/rr/build$ bin/pkeys
EXIT-SUCCESS
morbo@alderla:~/git/rr/build$ rr record bin/pkeys
rr: Saving execution to trace directory `/home/morbo/.local/share/rr/pkeys-0'.
EXIT-SUCCESS
(rr) cont
Continuing.
[FATAL ../src/ReplaySession.cc:1105:check_ticks_consistency()] 
 (task 426700 (rec:426694) at time 193)
 -> Assertion `ticks_now == trace_ticks' failed to hold. ticks mismatch for 'PATCH_SYSCALL'; expected 31170, got 31379
Tail of trace dump:
{
  real_time:11504.116780 global_time:173, event:`SYSCALL: prlimit64' (state:ENTERING_SYSCALL) tid:426694, ticks:29927
rax:0xffffffffffffffda rbx:0x1 rcx:0xffffffffffffffff rdx:0x0 rsi:0x3 rdi:0x0 rbp:0x7ffca369ef80 rsp:0x7ffca369ec78 r8:0x0 r9:0xc r10:0x7ffca369ec80 r11:0x246 r12:0x0 r13:0xfffffffffffffff8 r14:0x7ff4aabb2220 r15:0x7ff4aabb2220 rip:0x7ff4aaa423a4 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x12e fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.116795 global_time:174, event:`SYSCALL: prlimit64' (state:EXITING_SYSCALL) tid:426694, ticks:29927
rax:0x0 rbx:0x1 rcx:0xffffffffffffffff rdx:0x0 rsi:0x3 rdi:0x0 rbp:0x7ffca369ef80 rsp:0x7ffca369ec78 r8:0x0 r9:0xc r10:0x7ffca369ec80 r11:0x246 r12:0x0 r13:0xfffffffffffffff8 r14:0x7ff4aabb2220 r15:0x7ff4aabb2220 rip:0x7ff4aaa423a4 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x12e fs_base:0x7ff4aa922740 gs_base:0x0
  { tid:426694, addr:0x7ffca369ec80, length:0x10 }
}
{
  real_time:11504.116849 global_time:175, event:`SYSCALL: munmap' (state:ENTERING_SYSCALL) tid:426694, ticks:30056
rax:0xffffffffffffffda rbx:0x7ff4aabb2100 rcx:0xffffffffffffffff rdx:0x7ff4aa9b9b00 rsi:0x1a65d rdi:0x7ff4aab4d000 rbp:0x7ffca369ef80 rsp:0x7ffca369ec98 r8:0x0 r9:0xc r10:0xfffffffffffffb88 r11:0x246 r12:0x0 r13:0xfffffffffffffff8 r14:0x7ff4aabb2220 r15:0x7ff4aabb2220 rip:0x7ff4aaba040b eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xb fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.116870 global_time:176, event:`SYSCALL: munmap' (state:EXITING_SYSCALL) tid:426694, ticks:30056
rax:0x0 rbx:0x7ff4aabb2100 rcx:0xffffffffffffffff rdx:0x7ff4aa9b9b00 rsi:0x1a65d rdi:0x7ff4aab4d000 rbp:0x7ffca369ef80 rsp:0x7ffca369ec98 r8:0x0 r9:0xc r10:0xfffffffffffffb88 r11:0x246 r12:0x0 r13:0xfffffffffffffff8 r14:0x7ff4aabb2220 r15:0x7ff4aabb2220 rip:0x7ff4aaba040b eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xb fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.116924 global_time:177, event:`SYSCALL: rrcall_init_preload' (state:ENTERING_SYSCALL) tid:426694, ticks:30394
rax:0xffffffffffffffda rbx:0x7ffca369ee80 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x7ffca369ee20 rbp:0x7ffca369ee20 rsp:0x7ffca369ede0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x7ffca369f0a8 r14:0x7ff4aab72e50 r15:0x0 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e8 fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.116958 global_time:178, event:`SYSCALL: rrcall_init_preload' (state:EXITING_SYSCALL) tid:426694, ticks:30394
rax:0x0 rbx:0x7ffca369ee80 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x7ffca369ee20 rbp:0x7ffca369ee20 rsp:0x7ffca369ede0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x7ffca369f0a8 r14:0x7ff4aab72e50 r15:0x0 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e8 fs_base:0x7ff4aa922740 gs_base:0x0
  { tid:426694, addr:0x7ff4aab73228, length:0x400 }
  { tid:426694, addr:0x7ff4aab73222, length:0x1 }
  { tid:426694, addr:0x7ff4aab73224, length:0x4 }
  { tid:426694, addr:0x7ff4aab73223, length:0x1 }
  { tid:426694, addr:0x7ff4aab793e8, length:0x8 }
}
{
  real_time:11504.117009 global_time:179, event:`SYSCALL: rrcall_init_preload' (state:ENTERING_SYSCALL) tid:426694, ticks:30394
rax:0xffffffffffffffda rbx:0x7ffca369ee80 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x7ffca369ee20 rbp:0x7ffca369ee20 rsp:0x7ffca369ede0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x7ffca369f0a8 r14:0x7ff4aab72e50 r15:0x0 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e8 fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117035 global_time:180, event:`SYSCALL: rrcall_init_preload' (state:EXITING_SYSCALL) tid:426694, ticks:30394
rax:0x0 rbx:0x7ffca369ee80 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x7ffca369ee20 rbp:0x7ffca369ee20 rsp:0x7ffca369ede0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x7ffca369f0a8 r14:0x7ff4aab72e50 r15:0x0 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e8 fs_base:0x7ff4aa922740 gs_base:0x0
  { tid:426694, addr:0x7ff4aab73228, length:0x400 }
  { tid:426694, addr:0x7ff4aab73222, length:0x1 }
  { tid:426694, addr:0x7ff4aab73224, length:0x4 }
  { tid:426694, addr:0x7ff4aab73223, length:0x1 }
  { tid:426694, addr:0x7ff4aab793e8, length:0x8 }
}
{
  real_time:11504.117090 global_time:181, event:`INSTRUCTION_TRAP' tid:426694, ticks:31136
rax:0x1 rbx:0x2398a7eb rcx:0x98c027bc rdx:0xfc1cc410 rsi:0x0 rdi:0x7 rbp:0x7ffca369ef30 rsp:0x7ffca369ef28 r8:0x7ffca369ef54 r9:0x7ff4aab8dd00 r10:0xfffffffffffff6fe r11:0x4 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x55ef930194cd eflags:0x10246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xffffffffffffffff fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117175 global_time:182, event:`PATCH_SYSCALL' tid:426694, ticks:31139
rax:0x14a rbx:0x0 rcx:0xffffffffffffffff rdx:0x55ef9301a06a rsi:0x0 rdi:0x0 rbp:0x7ffca369ef70 rsp:0x7ffca369ef38 r8:0x7ffca369ef54 r9:0x7ff4aab8dd00 r10:0xfffffffffffff6fe r11:0x246 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x7ff4aaa4ed99 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xffffffffffffffff fs_base:0x7ff4aa922740 gs_base:0x0
  { map_file:"<ZERO>", addr:0x7ff4aab4e000, length:0x1000, prot_flags:"r-xp", file_offset:0x0, device:0, inode:0, data_file:"", data_offset:0x0, file_size:0x1000 }
  { tid:426694, addr:0x7ff4aab4e000, length:0x4f }
  { tid:426694, addr:0x7ff4aaa4ed99, length:0x5 }
  { tid:426694, addr:0x7ff4aaa4ed9e, length:0x3 }
}
{
  real_time:11504.117216 global_time:183, event:`SYSCALL: gettid' (state:ENTERING_SYSCALL) tid:426694, ticks:31143
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x0 rbp:0x7ffca369ef70 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xba fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117228 global_time:184, event:`SYSCALL: gettid' (state:EXITING_SYSCALL) tid:426694, ticks:31143
rax:0x682c6 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x0 rbp:0x7ffca369ef70 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xba fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117277 global_time:185, event:`SYSCALL: perf_event_open' (state:ENTERING_SYSCALL) tid:426694, ticks:31143
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0xffffffffffffffff rsi:0x0 rdi:0x681ffe60 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0xffffffffffffffff r11:0x246 r12:0x7ffca369f098 r13:0x682c6 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x12a fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117498 global_time:186, event:`SYSCALL: perf_event_open' (state:EXITING_SYSCALL) tid:426694, ticks:31143
rax:0x3 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0xffffffffffffffff rsi:0x0 rdi:0x681ffe60 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0xffffffffffffffff r11:0x246 r12:0x7ffca369f098 r13:0x682c6 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x12a fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117549 global_time:187, event:`SYSCALL: fcntl' (state:ENTERING_SYSCALL) tid:426694, ticks:31144
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x64 rsi:0x406 rdi:0x3 rbp:0x681ffe60 rsp:0x681ffd90 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x3 r13:0x682c6 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x48 fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117566 global_time:188, event:`SYSCALL: fcntl' (state:EXITING_SYSCALL) tid:426694, ticks:31144
rax:0x64 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x64 rsi:0x406 rdi:0x3 rbp:0x681ffe60 rsp:0x681ffd90 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x3 r13:0x682c6 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x48 fs_base:0x7ff4aa922740 gs_base:0x0
  { tid:426694, addr:0x7ff4aab7328c, length:0x1 }
}
{
  real_time:11504.117618 global_time:189, event:`SYSCALL: rrcall_init_buffers' (state:ENTERING_SYSCALL) tid:426694, ticks:31149
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x681ffe60 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x3 r13:0x682c6 r14:0x64 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e9 fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117753 global_time:190, event:`SYSCALL: rrcall_init_buffers' (state:EXITING_SYSCALL) tid:426694, ticks:31149
rax:0x7ff4aa822000 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x0 rdi:0x681ffe60 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x0 r9:0x0 r10:0x0 r11:0x246 r12:0x3 r13:0x682c6 r14:0x64 r15:0x7ff4aabafc40 rip:0x70000005 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x3e9 fs_base:0x7ff4aa922740 gs_base:0x0
  { map_file:"<ZERO>", addr:0x7ff4aa822000, length:0x100000, prot_flags:"rw-s", file_offset:0x0, device:64769, inode:135274, data_file:"", data_offset:0x0, file_size:0x100000 }
  { tid:426694, addr:0x7ff4aab7328c, length:0x1 }
  { tid:426694, addr:0x681ffe60, length:0x20 }
}
{
  real_time:11504.117816 global_time:191, event:`SYSCALL: pkey_alloc' (state:ENTERING_SYSCALL) tid:426694, ticks:31155
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x55ef9301a06a rsi:0x0 rdi:0x0 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x7ffca369ef54 r9:0x7ff4aab8dd00 r10:0xfffffffffffff6fe r11:0x246 r12:0x3 r13:0x682c6 r14:0x64 r15:0x7ff4aabafc40 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x14a fs_base:0x7ff4aa922740 gs_base:0x0
}
{
  real_time:11504.117830 global_time:192, event:`SYSCALL: pkey_alloc' (state:EXITING_SYSCALL) tid:426694, ticks:31155
rax:0x1 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x55ef9301a06a rsi:0x0 rdi:0x0 rbp:0x681ffe60 rsp:0x681ffdf0 r8:0x7ffca369ef54 r9:0x7ff4aab8dd00 r10:0xfffffffffffff6fe r11:0x246 r12:0x3 r13:0x682c6 r14:0x64 r15:0x7ff4aabafc40 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x14a fs_base:0x7ff4aa922740 gs_base:0x0 st0:0x0 st1:0x0 st2:0x0 st3:0x0 st4:0x0 st5:0x0 st6:0x0 st7:0x0 ymm0:0x0 ymm1:0x101010101010100 ymm2:0x7ff4aab7a530 ymm3:0x5f00 ymm4:0x6c616e7265746e695f666f747274735f ymm5:0x1 ymm6:0x1 ymm7:0x0 ymm8:0x0 ymm9:0x0 ymm10:0x0 ymm11:0x0 ymm12:0x0 ymm13:0x0 ymm14:0x0 ymm15:0x0
}
{
  real_time:11504.117906 global_time:193, event:`PATCH_SYSCALL' tid:426694, ticks:31170
rax:0x9 rbx:0x22 rcx:0xffffffffffffffff rdx:0x3 rsi:0x1000 rdi:0x0 rbp:0x0 rsp:0x7ffca369ef28 r8:0xffffffff r9:0x0 r10:0x22 r11:0x246 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x7ff4aaa46af4 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xffffffffffffffff fs_base:0x7ff4aa922740 gs_base:0x0
  { tid:426694, addr:0x7ff4aab4e04f, length:0x4f }
  { tid:426694, addr:0x7ff4aaa46af4, length:0x5 }
  { tid:426694, addr:0x7ff4aaa46af9, length:0x3 }
}
{
  real_time:11504.117943 global_time:194, event:`SYSCALL: mmap' (state:ENTERING_SYSCALL) tid:426694, ticks:31178
rax:0xffffffffffffffda rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x3 rsi:0x1000 rdi:0x0 rbp:0x0 rsp:0x681ffdf0 r8:0xffffffff r9:0x0 r10:0x22 r11:0x246 r12:0x7ffca369f098 r13:0x55ef93019565 r14:0x0 r15:0x7ff4aabafc40 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x9 fs_base:0x7ff4aa922740 gs_base:0x0
}
=== Start rr backtrace:
rr(_ZN2rr13dump_rr_stackEv+0x60)[0x5566c4aa5986]
rr(_ZN2rr9GdbServer15emergency_debugEPNS_4TaskE+0x1b2)[0x5566c48adf02]
rr(+0x3d5884)[0x5566c48e7884]
rr(_ZN2rr21EmergencyDebugOstreamD1Ev+0x63)[0x5566c48e7b0d]
rr(_ZN2rr13ReplaySession23check_ticks_consistencyEPNS_10ReplayTaskERKNS_5EventE+0x143)[0x5566c49e02bf]
rr(_ZN2rr13ReplaySession11replay_stepERKNS0_15StepConstraintsE+0x8dc)[0x5566c49e4388]
rr(_ZN2rr14ReplayTimeline19replay_step_forwardENS_10RunCommandEl+0x113)[0x5566c4a03139]
rr(_ZN2rr9GdbServer14debug_one_stepERNS_10GdbRequestE+0x5b4)[0x5566c48aacac]
rr(_ZN2rr9GdbServer12serve_replayERKNS0_15ConnectionFlagsE+0x561)[0x5566c48acf5f]
rr(+0x4c6bd5)[0x5566c49d8bd5]
rr(_ZN2rr13ReplayCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x41f)[0x5566c49d96f3]
rr(main+0x278)[0x5566c4ac1467]
/lib/x86_64-linux-gnu/libc.so.6(+0x2dfd0)[0x7f58e5456fd0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x7d)[0x7f58e545707d]
rr(_start+0x25)[0x5566c48006d5]
=== End rr backtrace
Launch gdb with
  gdb '-l' '10000' '-ex' 'set sysroot /' '-ex' 'target extended-remote 127.0.0.1:33484' /home/morbo/.local/share/rr/pkeys-0/mmap_hardlink_4_pkeys

@rocallahan
Collaborator

Does pkeys always fail in exactly the same way?
What do you get when you attach to the emergency debugger? What's the tracee stack trace?

@gcp
Contributor Author

gcp commented Dec 21, 2021

> Does pkeys always fail in exactly the same way?

Yes

> What do you get when you attach to the emergency debugger? What's the tracee stack trace?

(gdb) bt
#0  0x00007fb1a7cdd9b5 in __GI___libc_write (fd=1, buf=0x7ffcc5a1fe50, nbytes=26)
    at ../sysdeps/unix/sysv/linux/write.c:26
#1  0x000055f4d22df386 in atomic_printf (fmt=0x55f4d22e000c "FAILED: errno=%d (%s)\n")
    at ../src/test/x86/../util.h:165
#2  0x000055f4d22df40b in check_cond (cond=0) at ../src/test/x86/../util.h:183
#3  0x000055f4d22df42d in atomic_assert (cond=0, 
    str=0x55f4d22e0108 "(initial_pkru & ~(PKEY_DISABLE_ACCESS << (2 * pkey ))) == modified_pkru")
    at ../src/test/x86/../util.h:189
#4  0x000055f4d22df70e in main () at ../src/test/x86/pkeys.c:69

@rocallahan
Collaborator

What are the values of initial_pkru, pkey and modified_pkru under the emergency debugger during replay, and in a successful run outside of rr but under gdb, stopping with a breakpoint at line 69?

@gcp
Contributor Author

gcp commented Dec 23, 2021

morbo@alder:~/git/rr/build$ gdb ./bin/pkeys
GNU gdb (Ubuntu 11.1-0ubuntu2) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bin/pkeys...
(gdb) break pkeys.c:69
Breakpoint 1 at 0x16df: file ../src/test/x86/pkeys.c, line 69.
(gdb) run
Starting program: /home/morbo/git/rr/build/bin/pkeys 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, main () at ../src/test/x86/pkeys.c:69
69        test_assert((initial_pkru & ~(PKEY_DISABLE_ACCESS << (2 * pkey ))) == modified_pkru);
(gdb) print initial_pkru 
$1 = 1431655764
(gdb) print pkey
$2 = 1
(gdb) print modified_pkru 
$3 = 1431655760

@gcp
Contributor Author

gcp commented Dec 23, 2021

morbo@alder:~/git/rr/build$ gdb '-l' '10000' '-ex' 'set sysroot /' '-ex' 'target extended-remote 127.0.0.1:19941' /home/morbo/.local/share/rr/pkeys-3/mmap_hardlink_4_pkeys
GNU gdb (Ubuntu 11.1-0ubuntu2) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/morbo/.local/share/rr/pkeys-3/mmap_hardlink_4_pkeys...
Remote debugging using 127.0.0.1:19941
Reading symbols from /usr/local/bin/../lib/rr/librrpreload.so...
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/b8/037b6260865346802321dd2256b8ad1d857e63.debug...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/14/acb10bbdaefc6a64890c96417426ca820c0faa.debug...
BFD: warning: system-supplied DSO at 0x6fffd000 has a section extending past end of file
0x00007f5e96cc79b5 in __GI___libc_write (fd=1, buf=0x7ffc611a7d30, nbytes=26)
    at ../sysdeps/unix/sysv/linux/write.c:26
26      ../sysdeps/unix/sysv/linux/write.c: No such file or directory.
(gdb) bt
#0  0x00007f5e96cc79b5 in __GI___libc_write (fd=1, buf=0x7ffc611a7d30, nbytes=26)
    at ../sysdeps/unix/sysv/linux/write.c:26
#1  0x00005623c5bdd386 in atomic_printf (fmt=0x5623c5bde00c "FAILED: errno=%d (%s)\n")
    at ../src/test/x86/../util.h:165
#2  0x00005623c5bdd40b in check_cond (cond=0) at ../src/test/x86/../util.h:183
#3  0x00005623c5bdd42d in atomic_assert (cond=0, 
    str=0x5623c5bde108 "(initial_pkru & ~(PKEY_DISABLE_ACCESS << (2 * pkey ))) == modified_pkru")
    at ../src/test/x86/../util.h:189
#4  0x00005623c5bdd70e in main () at ../src/test/x86/pkeys.c:69
(gdb) up
#1  0x00005623c5bdd386 in atomic_printf (fmt=0x5623c5bde00c "FAILED: errno=%d (%s)\n")
    at ../src/test/x86/../util.h:165
165       return write(STDOUT_FILENO, buf, len);
(gdb) up
#2  0x00005623c5bdd40b in check_cond (cond=0) at ../src/test/x86/../util.h:183
183         atomic_printf("FAILED: errno=%d (%s)\n", errno, strerror(errno));
(gdb) up
#3  0x00005623c5bdd42d in atomic_assert (cond=0, 
    str=0x5623c5bde108 "(initial_pkru & ~(PKEY_DISABLE_ACCESS << (2 * pkey ))) == modified_pkru")
    at ../src/test/x86/../util.h:189
189       if (!check_cond(cond)) {
(gdb) up
#4  0x00005623c5bdd70e in main () at ../src/test/x86/pkeys.c:69
69        test_assert((initial_pkru & ~(PKEY_DISABLE_ACCESS << (2 * pkey ))) == modified_pkru);
(gdb) print initial_pkru 
$1 = 1431655764
(gdb) print pkey
$2 = 1
(gdb) print modified_pkru 
$3 = 1431655764
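
To make the two gdb sessions easier to compare, the following sketch in C reproduces the arithmetic of the assertion at pkeys.c:69 with the printed values. It assumes PKEY_DISABLE_ACCESS is 0x1, as in the Linux UAPI headers, and that PKRU uses two bits per key (bit 2*pkey disables access). The constant and function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match PKEY_DISABLE_ACCESS from the Linux UAPI headers;
   renamed here to avoid clashing with <sys/mman.h>. */
static const uint32_t SKETCH_PKEY_DISABLE_ACCESS = 0x1u;

/* PKRU value the test expects after pkey_alloc: the access-disable bit
   of the newly allocated key (bit 2 * pkey) is cleared. */
uint32_t expected_pkru_after_alloc(uint32_t initial_pkru, int pkey) {
  return initial_pkru & ~(SKETCH_PKEY_DISABLE_ACCESS << (2 * pkey));
}

/* Returns 1 when the observed PKRU satisfies the pkeys.c:69 assertion. */
int pkru_assertion_holds(uint32_t initial_pkru, int pkey,
                         uint32_t modified_pkru) {
  return expected_pkru_after_alloc(initial_pkru, pkey) == modified_pkru;
}
```

With initial_pkru = 1431655764 (0x55555554) and pkey = 1, the expected value is 1431655760 (0x55555550). That matches the run outside rr, whereas the replay still shows 1431655764, meaning the PKRU value seen during replay is identical to the initial one.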

@gcp gcp reopened this Dec 23, 2021
@gcp
Contributor Author

gcp commented Dec 23, 2021

Looks like some of the fixes in here already landed in f4529da through #3014.

@rocallahan
Collaborator

What's the kernel version?

@rocallahan
Collaborator

I don't understand what could be going on here. The test passed on AWS c5d.9xlarge, which behaves as expected: after the pkey_alloc syscall, the kernel has changed the PKRU register and we record that in ExtraRegisters in the trace, and we restore those registers from the trace during replay. So I wonder if kernel behavior has changed somehow so that the PKRU changes aren't visible to rr for some reason.

@rocallahan rocallahan merged commit a6eddf9 into rr-debugger:master Dec 23, 2021
@rocallahan
Collaborator

I think it makes sense to just merge this, so I've done so. Please file a new issue for the pkeys test failure.

@gcp
Contributor Author

gcp commented Dec 24, 2021

> What's the kernel version?

5.14.21

@Manouchehri Manouchehri mentioned this pull request Jul 24, 2022