64-bit boot stub? #579
Oh, and it might turn out useful if the boot stub code could be included in e.g. Qemu for RPi emulation - both the existing 32-bit version and any new 64-bit version. Do you have any problems with my doing that, even for the existing 32-bit binary that you wrote? I wonder if you could publish the source under an open license to help the contribution chain to Qemu. |
No problem with releasing the boot stub code. See: http://pastebin.com/raw/stxjiVVD Just let us know what addresses to modify. Currently we do:
But we could do something different in 64-bit mode. |
Hmm, it seems to work without issues, but at least according to the A53 TRM (p.328) the order of enabling the caches and SMP is wrong:
|
Writing to the SCR doesn't actually enable the icache/dcache. To enable the cache, the MMU must also be configured/enabled, and the "ARM boot stub" doesn't do that (Linux/U-Boot/... does it during early boot). |
Great, thanks! Do you know why the SCR code is in the stub at all? I guess the next stop is the kernel entry point here
What about the IIRC VC/ARM shared L2 cache? Is it always enabled? I couldn't find anything special in the kernel code. |
Correction: System Control Register is SCTLR, whereas SCR is Secure Configuration Register. I think the write to the SCTLR in the stub is not necessary; at least the arch/arm64 kernel repeats this and I'm sure the arch/arm kernel must too; see setup of x0 here: I suspect the comment in the kernel code refers to the logical state of the caches being disabled. The fact that this is implemented by writing bits in the SCTLR is true but simply not explicitly mentioned. I don't know if the L2 cache can be enabled/disabled, but IIUC all control over that cache is from the VideoCore FW. I believe its state is irrelevant to ARM-based SW, except for interaction with peripherals doing DMA, since it's beyond the point of coherency/unification for the ARM cores themselves. As such I would not expect to find any code that controls it. |
A first cut is available at https://github.com/swarren/rpi-3-aarch64-demo in armstub64.S. The most-recent-but-one includes a TODO list describing the outstanding changes. |
I have pushed what I hope is a final version of the 64-bit stub to that github repo. The source is in armstub64.S. I have tested this with U-Boot, and the test app in the same repo. I have not tested it with any Linux kernel yet, nor have I attempted to boot any Linux kernel from U-Boot. I believe @anholt will test it some time. The DTB address should be written as a 64-bit value to address 0xf8. The stub implements the standard RAM-based "spin table" method of SMP secondary CPU boot. This is a mechanism already supported by the mainline Linux kernel's arch/arm64 port, so should ease upstreaming kernel support; the arch/arm64 maintainers are unlikely to welcome a new SoC-specific SMP mechanism (besides spin table or PSCI) that uses the bcm283x mailbox registers. I'd like to suggest creating a new git repo that contains the ARM stub source for all the Pis. I'd be happy to contribute patches to such a repo to fix the issues I filed with the 32-bit stubs if there was somewhere I could send a regular git patch against. I'd also be happy to create the initial content of such a repo if you want; just let me know (assuming the pastebin you linked above is all the versions; otherwise it might be best if you filled in the existing 32-bit stub code first). |
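To make the DTB hand-off described above concrete, here is a minimal sketch of how a loader might patch the 64-bit DTB address into the stub image at offset 0xf8. The function name, the 0x100-byte stub size, the little-endian byte order, and the example address are assumptions for illustration only; this is not the firmware's actual code.

```python
import struct

DTB_PTR_OFFSET = 0xF8  # from the comment above: 64-bit DTB address slot
STUB_SIZE = 0x100      # assumption: the stub occupies the first 0x100 bytes


def patch_dtb_address(stub: bytes, dtb_addr: int) -> bytes:
    """Return a copy of the stub, zero-padded to STUB_SIZE, with dtb_addr
    stored at DTB_PTR_OFFSET as a little-endian 64-bit value (assumed)."""
    if len(stub) > STUB_SIZE:
        raise ValueError("stub larger than 0x100 bytes")
    image = bytearray(stub.ljust(STUB_SIZE, b"\x00"))
    struct.pack_into("<Q", image, DTB_PTR_OFFSET, dtb_addr)
    return bytes(image)


# Hypothetical DTB address, chosen only for the demonstration.
patched = patch_dtb_address(b"\x00" * 0x40, 0x0800_0000)
print(hex(struct.unpack_from("<Q", patched, DTB_PTR_OFFSET)[0]))
```

A real stub would already contain code in the bytes below 0xf8; the zero-filled input here only keeps the example short.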
Do I need any custom version of boot-firmware for this or is the HEAD of github.com/raspberrypi/firmware master branch supposed to work? Somehow this is not working for me, I did try to put prints at the beginning of armstub64.S as well, but it didn't help. This might be worth documenting in the README as well as the firmwares shipped with 2016-03-18-raspbian-jessie don't respond to 'enable_uart=250'. BTW, I really hope that when this stub is integrated to VC FWs it is picked from a bin file as Stephen suggested |
I'm happy to add support to load: at address zero (up to 0x100 bytes). If Also the presence of kernel8.img will enable 64-bit mode. |
@sukantoghosh I tested with 046effa "firmware: arm_loader: emmc clock depends on core clock See: #572", which was HEAD of github.com/raspberrypi/firmware master branch when I last fetched a few days ago. |
@popcornmix that all sounds good to me, thanks! Just one query though: The stub code you posted had an ifdef for BCM2810-vs-not, so I think you'd end up needing 4 FW files: ARMv6, ARMv7, ARMv8 32-bit mode, and ARMv8 64-bit mode, or am I misinterpreting something? |
@sukantoghosh that should be enable_uart=1 not =250. I'll fix my repo in a minute. |
Yes, you are right, the stub code is slightly different between Pi2 and Pi3 so we should have an extra stub file. |
thanks @swarren it works for me |
With @swarren's boot stub I have SMP enabled on 64-bit Linux, and I'm making it to the point of trying to NFS root mount before it fails because USB device probing isn't working. On 2836, the boot stub was setting a 19.2MHz CNTFREQ but not LOCAL_CONTROL/PRESCALER, so we set the LOCAL_* regs early in Linux boot. For this boot stub, I've disabled the LOCAL_* setup so that we run with the 1MHz it sets in CNTFREQ. I'm wondering, though: Don't we want to run the timer at 19.2MHz? If so, should the firmware/boot stub take over that responsibility, if it was already setting CNTFREQ previously? |
BTW, there's some discussion re: DWC2 and broken DMA support in the kernel on page 3 (and later) of https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=137963&start=50. I'm not sure if that's talking about the RPi Foundation kernel or mainline. Where (which module) are the LOCAL_CONTROL/PRESCALER registers; I'm not familiar with those. |
Another difference that I spotted during my experiments (and I guess the http://lxr.free-electrons.com/source/arch/arm64/Makefile#L53 TEXT_OFFSET := 0x0008_0000 # 512K The image should ideally be loaded at the address taken from the image
|
I've added the arm stubs here: raspberrypi/tools@920c7ed Here is a (not hugely tested) firmware: If you have a |
drivers/irqchip/irq-bcm2836.c |
@popcornmix I've tested firmware_armstub.zip and it works great, thanks! I tested:
I didn't use mkknlimg in any of my testing. I should probably switch to that soon:-) |
@capnm yes it looks like the FW and 64-bit ARM stub can't just hard-code the kernel address; it must be read from the kernel image header. I'll prepare a patch for the stub that:
|
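As a rough illustration of reading the load offset from the kernel image header rather than hard-coding it, the sketch below parses the `text_offset` field of an arm64 `Image` header (a little-endian 64-bit value at byte offset 8, with the magic `ARM\x64` at byte offset 56, per the kernel's arm64 booting documentation). The function name and the fallback to 0x80000 when no magic is found are assumptions made for this example, not the behaviour of the stub or the firmware.

```python
import struct

ARM64_MAGIC = b"ARM\x64"       # magic at byte offset 56 of the Image header
DEFAULT_TEXT_OFFSET = 0x80000  # TEXT_OFFSET quoted above from arch/arm64/Makefile


def kernel_text_offset(image: bytes) -> int:
    """Return text_offset from an arm64 Image header.

    Falls back to DEFAULT_TEXT_OFFSET when the header magic is missing
    (a choice made for this sketch only)."""
    if len(image) < 64 or image[56:60] != ARM64_MAGIC:
        return DEFAULT_TEXT_OFFSET
    (text_offset,) = struct.unpack_from("<Q", image, 8)
    return text_offset


# Build a fake 64-byte header with a made-up text_offset for the demonstration.
header = bytearray(64)
struct.pack_into("<Q", header, 8, 0x200000)
header[56:60] = ARM64_MAGIC
print(hex(kernel_text_offset(bytes(header))))
```

The kernel's load address would then be the RAM base (2 MiB aligned) plus this offset.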
@popcornmix I've also tested using custom ARM stubs (only on RPi3 in 64-bit mode). If I use config.txt option armstub=armstub8.bin, it works great. However, if I don't put any armstub= option in config.txt, the FW doesn't seem to automatically check for and load armstub8.bin (even though I put the "kernel" in kernel8.img). I thought from your description it would? |
Yes, your expectations are correct. I'll push an updated firmware later today that fixes a few issues. Hopefully that will work. Currently, not specifying |
@popcornmix my inclination is to keep things simple. Have the FW decide whether to boot in 32- or 64-bit mode solely based on the kernel image filename, and don't take the stub filename into account. That's because the kernel is the primary piece of SW, and the stub is somewhat only there to support the kernel. Once that bit-size determination is made, use that to select the default stub filename. This way, the FW doesn't have to have rules to decide whether the bit size implied by the kernel filename overrides the bit size implied by the stub filename, or the other way around. If someone wants to do something really unusual and run a 32-bit OS (kernel) under a 64-bit secure monitor/OS, I assume they'd make that work by putting the secure monitor into kernel8.img or setting arm_control=0x200 (thus ensuring 64-bit booting), using a custom stub file (thus avoiding the built-in switch to HYP/EL2), and having the secure OS load the real kernel image (or perhaps concatenating it with the secure OS image). Hence, even with this simple scheme, doing something unusual is still possible. I envisage something like:
|
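The selection scheme argued for in the comment above could be sketched roughly as follows. This is only an illustration of the prose: the function name, the `armstub7.bin` default for 32-bit mode, and the "fall back to the built-in stub when no file is present" rule are assumptions, and the real firmware logic may differ.

```python
from typing import Dict, Optional, Set, Tuple


def select_boot(files: Set[str], config: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (bit_size, stub_filename); stub_filename is None when the
    built-in stub would be used (assumed behaviour)."""
    # Bit size comes from the kernel filename or arm_control only,
    # never from the stub filename.
    forced_64 = (int(config.get("arm_control", "0"), 0) & 0x200) != 0
    bits = 64 if forced_64 or "kernel8.img" in files else 32

    # An explicit armstub= option wins; otherwise use the default name
    # implied by the bit size (armstub7.bin is a hypothetical name here).
    stub = config.get("armstub")
    if stub is None:
        stub = "armstub8.bin" if bits == 64 else "armstub7.bin"
    return bits, (stub if stub in files else None)


print(select_boot({"kernel8.img", "armstub8.bin"}, {}))
print(select_boot({"kernel7.img"}, {"arm_control": "0x200"}))
```

Keeping the bit-size decision independent of the stub name means there is no precedence rule to define between the two filenames.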
@anholt the timer frequency setup is implemented in raspberrypi/tools#54, and tested w/ U-Boot's sleep command. |
@swarren there are a number of "bare metal" users who just use a single block of arm code. |
Here is latest test firmware. It includes latest armstub8.S from github: It is useful for the firmware to provide some information. Currently: Currently device_tree_address is (uint32_t *)0x34 for Pi1 and (uint32_t *)0x14 for Pi2/Pi3. armstub8.S also uses (uint64_t *)0xd0 for kernel_entry As armstub7.S is quite cramped in 0x100 bytes currently (we may be able to increase the size, but that could break some bare metal assumptions), using 0xd0 might be awkward. Can we define an API for where to put these words? It seems 32-bit values will be enough as bus addresses are still 32-bits even in 64-bit mode. Using unused vectors (like 0x14) is convenient for 32-bit mode. Can we find suitable values that work for both 32-bit and 64-bit stubs? We could always use different values for 32-bit and 64-bit, but if making them the same is possible, it simplifies things. |
device_tree_end shouldn't be needed, since the dtb contains a length and any serious OS will unflatten it into a more convenient form ASAP. |
armstub8.bin only |
So there are no |
I've had some stale files, sorry for the false alarm. A fresh checkout and suddenly it worked again. I tested the essential ARM64 relevant cases, everything LGTM. Now it's time for useful development work. Thanks @swarren @popcornmix @pelwell ! tested (armstub8.bin, rm kernel8* config.txt)
stub without magic no change, ok 0x00000108 -> 0x00000100 .. 0x00008000 0xffffffff -> 0x0, ok
|
Well, nothing is going to help if you delete the kernel accidentally. If you delete it deliberately, then having to update the stub as well is going to be a bit annoying.
Well, it could perhaps flash the LED on the board, but that's about all it could do. I wouldn't expect noticing that the kernel file is missing to take particularly long, but perhaps that's simply because I have scripts that copy everything to the SD card, so I'd probably just re-run them, find it works, be puzzled and move on. @pelwell the 0xffffffff changes make sense to me. I tested the latest FW with both rpi-3-aarch64-demo and U-Boot with just a kernel8.img and no armstub8.bin and no config.txt options to force 64-bit and everything worked as expected. Since everything is basically working fine w.r.t. armstub8.bin, I'm closing this bug. If problems are found, we can open new bugs for those specific changes. Thanks! |
@pelwell Setting Right now the only solution I have is to have a jump to 0x100000 as the first instruction of my kernel, which I have padded with zeros between 8004 and 100000: /usr/bin/printf "\001\366\240\343" > first_commands
dd of=kernel7.img bs=1m count=4 if=/dev/zero
dd of=kernel7.img bs=1 conv=notrunc if=first_commands
dd of=kernel7.img bs=32k oseek=31 conv=notrunc if=kernel.bin |
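The arithmetic behind these commands can be checked with a short sketch. It decodes the four bytes written by `printf` as a little-endian ARM instruction and computes where `kernel.bin` ends up in memory, assuming the image is loaded at 0x8000 (the usual 32-bit load address mentioned elsewhere in this thread). The helper name is made up for the example.

```python
import struct


def decode_mov_pc_imm(word: int) -> int:
    """Decode an unconditional ARM 'mov pc, #imm' and return imm."""
    if word >> 28 != 0xE:              # condition field: always
        raise ValueError("not an unconditional instruction")
    if (word >> 20) & 0xFF != 0x3A:    # MOV with immediate operand, S=0
        raise ValueError("not a MOV immediate")
    if (word >> 12) & 0xF != 0xF:      # destination register must be pc
        raise ValueError("destination is not pc")
    rot = ((word >> 8) & 0xF) * 2      # rotate-right amount
    imm8 = word & 0xFF
    return ((imm8 >> rot) | (imm8 << (32 - rot))) & 0xFFFFFFFF


first_commands = b"\001\366\240\343"   # the bytes emitted by printf above
(word,) = struct.unpack("<I", first_commands)
target = decode_mov_pc_imm(word)

LOAD_ADDRESS = 0x8000                  # assumption: default 32-bit load address
file_offset = 31 * 32 * 1024           # dd bs=32k oseek=31
print(hex(word), hex(target), hex(LOAD_ADDRESS + file_offset))
```

So the first word is `mov pc, #0x100000`, and the payload at file offset 0xf8000 lands exactly at 0x100000 once the image is loaded at 0x8000.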
@SylvainGarrigues No, that is not what is intended, and it isn't immediately obvious how that could be happening. Are you using an external stub? Is the stub being patched, and if so with what? |
@pelwell I am not using any stub, and my config.txt just reads:
Padding the kernel with 0 like I mentioned earlier works with this config.txt. If I add the line |
That's not my experience. With this config.txt:
and after zeroing the first 2MB of memory (not a requirement, just to confirm that it isn't picking up anything from the previous attempt), it boots. This is with a standard 32-bit compressed kernel built to run at 0x8000, so the decompressor must be decompressing from 0x100000 back to 0x8000 (the memory at 0x8000 once the kernel is running looks like the start of a normal kernel). |
And to remove any doubt, I see the kernel being loaded to 0x100000 as expected. All modifications to the internal kernel_address variable, apart from the initialisation from config.txt, are conditional on it being non-zero, so I don't see how this can be failing for you. |
FWIW, yes the 32-bit ARM zImage decompressor is position-independent and so will always decompress to 0x8000 (or wherever the Image is linked, I suppose) no matter where it's loaded. Well, IIRC the decompressor must be loaded within the first 128MiB of RAM for AUTO_ZRELADDR to work correctly. As an aside, when using zImage, 0x8000 (or thereabouts) is actually about the worst place the kernel could be loaded, since the decompressor must first copy itself somewhere else and then perform decompression, so that the decompressed data doesn't over-write the compressed data during decompression. If the zImage was loaded elsewhere (say, 32MB into RAM) that initial memcpy wouldn't be needed. Still, too late to change that default I suspect. |
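The 128MiB remark can be illustrated with a small sketch. As far as I understand the decompressor's AUTO_ZRELADDR logic, it masks its own program counter down to a 128MiB boundary and adds TEXT_OFFSET; the function name and the example load addresses below are made up for the illustration.

```python
TEXT_OFFSET = 0x8000             # 32-bit ARM kernel text offset (the 0x8000 above)
ZRELADDR_MASK = 0xF8000000       # rounds an address down to a 128 MiB boundary


def auto_zreladdr(decompressor_pc: int) -> int:
    """Approximate the AUTO_ZRELADDR computation: guess the start of RAM by
    masking the decompressor's own address, then add TEXT_OFFSET."""
    return (decompressor_pc & ZRELADDR_MASK) + TEXT_OFFSET


MIB = 1024 * 1024
for load in (0x8000, 0x100000, 32 * MIB, 160 * MIB):
    print(hex(load), "->", hex(auto_zreladdr(load)))
```

With RAM starting at address 0, any load address below 128MiB yields 0x8000, while a load at 160MiB would wrongly yield 0x08008000, which is why the decompressor needs to sit within the first 128MiB.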
I am sorry, I was mistaken, I had a serial problem. My kernel is a raw binary (therefore neither an ELF nor a zImage) and it appears my kernel is loaded at the address specified by Now I can boot FreeBSD with just two lines in config.txt:
So cool! No need for mkknlimg anymore, no need for u-boot either, no need for a stub, no need for kernel_old. Thanks! |
kernel: bcm2835_thermal: Don't report unsupported trip type kernel: scripts/dtc: Only emit local fixups for overlays kernel: bcm2835: do not require substream for accessing chmap ctl kernel: bcm2835: add fallback channel layouts if channel map API is not used kernel: bcm2835: log which channel map is set See: raspberrypi/linux#1257 firmware: armstubs: Zero kernel and DTB addresses to match external stubs firmware: arm_dt: If the trailer exists, ignore device_tree= firmware: arm_loader: Load DTB high if insufficient space See: #579 firmware: dtoverlay: Refactor applying overlays to permit snooping firmware: dtoverlay: Add dtparam command firmware: dtmerge: Don't crash if the overlay fails to load firmware: dtoverlay: Integer overrides can create and extend properties See: https://www.raspberrypi.org/forums/viewtopic.php?f=107&t=139732 firmware: platform: Don't overwrite disable_pvt=2 firmware: board_info: CM3 has no Bluetooth firmware: dispmanx: Remove ifdefs for obsolete platforms firmware: vc_image: Remove ifdefs for obsolete platforms firmware: vc_audio: Remove ifdefs for obsolete platforms firmware: vcfw: Remove ifdefs for obsolete platforms firmware: scalerlib: testing: treat HVS dst_x parameter as 11-bit unsigned
@SylvainGarrigues
scp arch/arm64/boot/Image $rpi3/kernel8.img |
See: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=136445 firmware: IL ISP: Correct RGB to YUV matrices, and ignore code side info firmware: MJPEG encode: Handle stereoscopic images See: https://www.raspberrypi.org/forums/viewtopic.php?f=43&t=138325&p=918041 firmware: IL Camera: Change unspecified colour space to being JFIF See: raspberrypi/userland#78 firmware: OV5647: Option to configure auto lens shading to use potential fix firmware: arm_loader: Factor out DT support into arm_dt See: raspberrypi/linux#1394 firmware: arm_ldconfig: Switch to using arm stubs generated from tools/mkimage firmware: arm_ldconfig: Support loading arm stubs from file See: raspberrypi#579
firmware: config: Add arm_64bit setting firmware: arm_ldconfig: Set kernel_address for 64-bit boot See: raspberrypi#579
kernel: Add Support for BoomBerry Audio boards See: raspberrypi/linux#1397 kernel: Add support for the Digital Dreamtime Akkordion music player See: raspberrypi/linux#1406 kernel: Add support for mcp7940x family of RTC See: raspberrypi/linux#1397 firmware: vcilcs: Warn as message queue approaches fullness See: raspberrypi#449 firmware: dtoverlay: Copy overrides before applying firmware: dtmerge: Pack the merged DTB before writing firmware: arm_ldconfig: Fix detection of kernel8.img firmware: arm_loader: Enable DT by default, read addresses back from stub See: raspberrypi#579 firmware: ldconfig: Add [none] section as a convenience as config.txt filter firmware: pwm_sdm: Bugfixes See: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=136445 firmware: gencmd: Add command to read current and historical throttled state
The values come from @pelwell, who works on the boot loader: raspberrypi/firmware#579 (comment)
If I write a "boot stub" for the VC FW to place into memory at address 0 in 64-bit mode, will you accept it into the VC FW? If so:
a) What license do you want it under? I'd be happy to make it MIT/BSD/X11 or assign copyright to the Pi Foundation; whatever is easiest for you.
b) Which address(es) does the VC FW write to? I assume at least 0x14 for the ATAGS/DTB pointer, but I'm not sure whether the initialization values for cntfrq or the Linux machine ID are written too. I'm discounting address 0x100 and on since that's just data and not part of the boot stub itself.
(As an aside, if this stub could be loaded from a separate .bin file, that might be helpful since it'd allow easy experimentation, but that would then get into a discussion of how to name the files since we'd need different files for bcm2835, bcm2836, bcm2837 32-bit, bcm2837 64-bit).