OSv does not support 5-level paging #1292

Open
Gege-Wang opened this issue Jan 9, 2024 · 10 comments

@Gege-Wang

Assertion failed: mmu::phys_bits <= mmu::max_phys_bits (arch/x64/arch-setup.cc: arch_setup_free_memory: 140)
Halting.
QEMU: Terminated

I hit this problem when I run ./scripts/run.py. I am using QEMU 6.2.0; I also tried QEMU 7.0.0, and the problem still occurs.

@wkozaczuk
Collaborator

Hi,

Can you add more specifics, such as:

  • host version (fedora, ubuntu, etc), gcc version
  • full command to build the image (aka scripts/build image=??? ...)
  • run command (aka scripts/run.py ???) and output of it if you add --dry-run
  • full error message
  • which version of OSv

Thanks,
Waldek

@Gege-Wang
Author

Ok,

Here are some specifics,

  • host version "Ubuntu 22.04.2 LTS"
  • gcc version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
  • build command sudo ./scripts/build -j4 fs=rofs image=native-example (I also tried some of the commands in the README)
  • run command ./scripts/run.py
miyamo@wark-i1:~/osv$ sudo ./scripts/run.py --dry-run
/home/miyamo/osv/scripts/../scripts/imgedit.py setargs /home/miyamo/osv/build/last/usr.img "--rootfs=rofs /hello"
qemu-system-x86_64 \
-m 2G \
-smp 4 \
-vnc :1 \
-gdb tcp::1234,server,nowait \
-device virtio-blk-pci,id=blk0,drive=hd0,scsi=off,bootindex=0 \
-drive file=/home/miyamo/osv/build/last/usr.img,if=none,id=hd0,cache=none,aio=native \
-netdev user,id=un0,net=192.168.122.0/24,host=192.168.122.1 \
-device virtio-net-pci,netdev=un0 \
-device virtio-rng-pci \
-enable-kvm \
-cpu host,+x2apic \
-chardev stdio,mux=on,id=stdio,signal=off \
-mon chardev=stdio,mode=readline \
-device isa-serial,chardev=stdio
  • full error message
OSv v0.57.0-122-gfb7ab251
Assertion failed: mmu::phys_bits <= mmu::max_phys_bits (arch/x64/arch-setup.cc: arch_setup_free_memory: 140)
Halting.
  • OSv version
    I am using the master branch.

Thanks,
miyamo

@wkozaczuk
Collaborator

Thanks. Is there a reason you are running with sudo?

Also, can you connect with the debugger and grab a full stack trace?

@wkozaczuk
Collaborator

wkozaczuk commented Jan 10, 2024

Ok, I think I know where it crashes in arch-setup.cc (line 140):

135     auto c = processor::cpuid(0x80000000);
136     if (c.a >= 0x80000008) {
137         c = processor::cpuid(0x80000008);
138         mmu::phys_bits = c.a & 0xff;
139         mmu::virt_bits = (c.a >> 8) & 0xff;
140         assert(mmu::phys_bits <= mmu::max_phys_bits);
141     }

Is your host a physical machine or is it a nested virtualization example? Also, is it an Intel or AMD machine?

Can you run lscpu and specify the output of it here?
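
The decoding on lines 138-139 can be reproduced outside the kernel to see which values would trip the assertion. This is a minimal sketch, assuming the usual layout of EAX for CPUID leaf 0x80000008 (bits 7:0 hold the physical address width, bits 15:8 the virtual address width). `decode_address_sizes` and `passes_phys_check` are hypothetical helpers, not OSv functions, the limit is passed in by the caller because the value of `mmu::max_phys_bits` is not quoted in this thread, and the sample EAX value is made up for illustration.

```python
from typing import NamedTuple


class AddressSizes(NamedTuple):
    phys_bits: int
    virt_bits: int


def decode_address_sizes(eax: int) -> AddressSizes:
    """Split EAX of CPUID leaf 0x80000008 into its two 8-bit fields."""
    phys_bits = eax & 0xFF          # bits 7:0, same mask as line 138
    virt_bits = (eax >> 8) & 0xFF   # bits 15:8, same shift as line 139
    return AddressSizes(phys_bits, virt_bits)


def passes_phys_check(sizes: AddressSizes, max_phys_bits: int) -> bool:
    """Mirror the comparison on line 140 for a caller-supplied limit."""
    return sizes.phys_bits <= max_phys_bits


# Illustrative EAX value, not read from real hardware.
sizes = decode_address_sizes(0x3028)
print(sizes)
```

Note that the assertion only looks at the physical width, so the virtual width by itself does not decide whether it fires.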

@Gege-Wang
Copy link
Author

Hi,

I use sudo because I created a new user on this machine; without sudo I get a permission denied error.

My host is a physical machine. Here is some extra information.

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 57 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Silver 4410Y
    CPU family:          6
    Model:               143
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            8
    BogoMIPS:            4000.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 invpcid_single cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust sgx bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect avx_vnni avx512_bf16 wbnoinvd dtherm ida arat pln pts hfi avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd sgx_lc fsrm md_clear serialize tsxldtrk pconfig arch_lbr ibt amx_bf16 avx512_fp16 amx_tile amx_int8 flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   576 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    24 MiB (12 instances)
  L3:                    30 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-23
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Thanks,
miyamo

@Gege-Wang
Author

$ systool -m kvm_intel -v | grep -i nested
nested = "Y"
nested_early_check = "N"
$ cat /sys/module/kvm_intel/parameters/nested
Y

@wkozaczuk
Collaborator

Thanks for the info. This line is essential:

 Address sizes:         46 bits physical, 57 bits virtual

I think it indicates your host comes with 5-level paging (57 bits virtual space) vs 4-level paging (48 bits) and I do not think OSv supports 5-level paging yet.
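
The 48 and 57 figures follow directly from the x86-64 page-table layout: 12 bits of page offset for 4 KiB pages, plus 9 index bits per table level (512 entries per table). A small sketch of that arithmetic, with hypothetical helper names chosen only for illustration:

```python
PAGE_OFFSET_BITS = 12      # 4 KiB pages
INDEX_BITS_PER_LEVEL = 9   # 512 entries per page table


def virt_bits_for_levels(levels: int) -> int:
    """Virtual address width covered by a given number of paging levels."""
    return PAGE_OFFSET_BITS + INDEX_BITS_PER_LEVEL * levels


def levels_for_virt_bits(virt_bits: int) -> int:
    """Inverse mapping; rejects widths that do not fit the 12 + 9*n pattern."""
    index_bits = virt_bits - PAGE_OFFSET_BITS
    if index_bits <= 0 or index_bits % INDEX_BITS_PER_LEVEL != 0:
        raise ValueError(f"{virt_bits} bits does not match 12 + 9*n")
    return index_bits // INDEX_BITS_PER_LEVEL


for levels in (4, 5):
    bits = virt_bits_for_levels(levels)
    print(f"{levels} levels -> {bits} bits -> {2 ** bits} bytes of virtual space")
```

So a host reporting 57 virtual bits corresponds to one extra table level on top of the 48-bit, 4-level layout.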

Is there a way to have qemu simulate 4-level paging?

@wkozaczuk
Collaborator

Another possibility would be to have OSv detect 5-level paging but somehow only use the 4 levels of it. We would have to see if that is possible and how this would work.

Also, I know there are some apps (JITs and the like) that use the upper 16 bits of 64-bit virtual addresses to encode some information.
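
To make the concern about the upper bits concrete: on x86-64 an address is canonical only if every bit above the implemented virtual width is a copy of the top implemented bit. With 48-bit addresses that leaves 16 bits that carry no information, while with 57-bit addresses only 7 remain. The sketch below illustrates the rule; the helper names and the sample tag value are hypothetical and not taken from OSv or from any particular application.

```python
ADDRESS_WIDTH = 64


def spare_upper_bits(virt_bits: int) -> int:
    """Number of bits above the implemented virtual address width."""
    return ADDRESS_WIDTH - virt_bits


def is_canonical(addr: int, virt_bits: int) -> bool:
    """True if bits 63..(virt_bits-1) of a 64-bit address are all equal."""
    if not 0 <= addr < (1 << ADDRESS_WIDTH):
        raise ValueError("addr must fit in 64 bits")
    upper = addr >> (virt_bits - 1)                        # top implemented bit and above
    all_ones = (1 << (ADDRESS_WIDTH - virt_bits + 1)) - 1
    return upper in (0, all_ones)


# A pointer with a made-up 16-bit tag stored in bits 63:48.
tagged = (0xABCD << 48) | 0x7F0012345000
untagged = tagged & ((1 << 48) - 1)

print(is_canonical(tagged, 48), is_canonical(untagged, 48))
print(spare_upper_bits(48), spare_upper_bits(57))
```

An address such as 0x0000800000000000 is rejected under the 48-bit rule but accepted under the 57-bit one, which is why code that assumes 16 free bits needs care once 5-level paging is in play.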

@wkozaczuk changed the title from "Assertion failed: mmu::phys_bits <= mmu::max_phys_bits" to "OSv does not support 5-level paging" on Jan 10, 2024
@wkozaczuk
Collaborator

Based on this article it seems you can disable 5-level paging in the host.
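
The article itself is not reproduced in this thread. One documented way to do this on Linux is the `no5lvl` kernel boot parameter; that this is what the article describes is an assumption. The sketch below only inspects state: it looks for the `la57` flag in `/proc/cpuinfo`-style text and for `no5lvl` in `/proc/cmdline`-style text. Function names are hypothetical, and the inputs are passed as strings so the logic can be tried without touching the host.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def five_level_paging_status(cpuinfo_text: str, cmdline_text: str) -> dict:
    """Summarise whether la57 is advertised and whether no5lvl was requested."""
    return {
        "la57_flag": "la57" in cpu_flags(cpuinfo_text),
        "no5lvl_on_cmdline": "no5lvl" in cmdline_text.split(),
    }


# Made-up sample inputs, not captured from the reporter's machine.
sample_cpuinfo = "processor\t: 0\nflags\t\t: fpu vme la57 rdpid\n"
sample_cmdline = "BOOT_IMAGE=/vmlinuz ro quiet"
print(five_level_paging_status(sample_cpuinfo, sample_cmdline))
```

On a real host the two arguments would come from reading `/proc/cpuinfo` and `/proc/cmdline`. Whether booting the host with 4-level paging makes the OSv assertion go away is not established in this thread.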

@Gege-Wang
Author

I tried it in another virtual machine, whose specifics are as follows:

Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
Address sizes:                      46 bits physical, 48 bits virtual
CPU(s):                             8
On-line CPU(s) list:                0-7
Thread(s) per core:                 1
Core(s) per socket:                 1
Socket(s):                          8
NUMA node(s):                       1
Vendor ID:                          GenuineIntel
CPU family:                         6
Model:                              85
Model name:                         Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Stepping:                           4
CPU MHz:                            2499.998
BogoMIPS:                           4999.99
Virtualization:                     VT-x
Hypervisor vendor:                  KVM
Virtualization type:                full
L1d cache:                          256 KiB
L1i cache:                          256 KiB
L2 cache:                           32 MiB
L3 cache:                           128 MiB
NUMA node0 CPU(s):                  0-7
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit:        KVM: Mitigation: VMX disabled
Vulnerability L1tf:                 Mitigation; PTE Inversion; VMX flush not necessary, SMT disabled
Vulnerability Mds:                  Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown:             Vulnerable
Vulnerability Mmio stale data:      Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed:             Mitigation; IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; IBRS, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Mitigation; Clear CPU buffers; SMT Host state unknown
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku ospke md_clear arch_capabilities

This virtual machine also has 5-level page tables enabled, but it does not hit the problem. How can that be?

Thanks,
miyamo
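
One way to compare the two machines is to pull the `Address sizes` line out of each lscpu output. In the listings in this thread, the physical host reports 57 virtual bits and lists `la57` among its flags, while the second virtual machine reports 48 virtual bits and has no `la57` flag. That suggests the second guest is only offered 4-level paging, although this reading is an inference from the output and is not confirmed by anyone in the thread. A minimal sketch, with hypothetical helper names:

```python
import re
from typing import Optional, Tuple

_ADDRESS_SIZES = re.compile(
    r"Address sizes:\s*(\d+) bits physical,\s*(\d+) bits virtual"
)


def parse_address_sizes(lscpu_text: str) -> Optional[Tuple[int, int]]:
    """Return (physical_bits, virtual_bits) from lscpu output, or None."""
    match = _ADDRESS_SIZES.search(lscpu_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def reports_57_bit_virtual(lscpu_text: str) -> bool:
    """True if lscpu reports the 57-bit width associated with 5-level paging."""
    sizes = parse_address_sizes(lscpu_text)
    return sizes is not None and sizes[1] == 57


# The two "Address sizes" lines quoted in this thread.
physical_host = "Address sizes:         46 bits physical, 57 bits virtual"
second_vm = "Address sizes:                      46 bits physical, 48 bits virtual"

for name, text in (("physical host", physical_host), ("second VM", second_vm)):
    print(name, parse_address_sizes(text), reports_57_bit_virtual(text))
```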
