This repository has been archived by the owner on Aug 6, 2024. It is now read-only.

Suspend to RAM with mainline #3

Open
samueldr opened this issue Jan 17, 2020 · 38 comments
Labels
known-issue Something isn't working

Comments

@samueldr
Owner

(Tracking issue...)

@samueldr samueldr added the known-issue Something isn't working label Jan 17, 2020
@samueldr samueldr changed the title from "Suspend to RAM with tsys' kernel" to "Suspend to RAM with tsys' kernel tracking issue" Jan 17, 2020
@theotherjimmy

theotherjimmy commented Jan 29, 2020

Might be related: rockchip-linux/kernel@3cc3b03

EDIT: FYI, the files added in that commit are not present in tsys' kernel.

@samueldr
Owner Author

Entirely plausible. Good find!

I wonder if that driver has a mainline patch open. Otherwise, it looks relatively self-contained; I wonder how hard it would be to forward-port.

@theotherjimmy

theotherjimmy commented Jan 29, 2020

I'm testing the blunt forward port right now. It did not apply cleanly, so I'm going to have to do something about that.

@theotherjimmy

theotherjimmy commented Jan 30, 2020

Working (EDIT: as in "I'm working on it in this branch", not "this branch works") branch: https://github.com/theotherjimmy/wip-pinebook-pro/tree/sleep

@theotherjimmy

I got suspend to RAM working! I had no measurable charge loss over 4 hours of suspend. Logs to show that it happened (note the "deep"):

Feb 08 08:33:39 nixos kernel: PM: suspend entry (deep)
Feb 08 12:39:21 nixos kernel: Filesystems sync: 0.191 seconds
Feb 08 12:39:21 nixos kernel: dwmmc_rockchip fe310000.dwmmc: pre_suspend failed for non-removable host>
Feb 08 12:39:21 nixos kernel: Freezing user space processes ... (elapsed 0.002 seconds) done.
Feb 08 12:39:21 nixos kernel: OOM killer disabled.
Feb 08 12:39:21 nixos kernel: Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
Feb 08 12:39:21 nixos kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Feb 08 12:39:21 nixos kernel: Disabling non-boot CPUs ...
Feb 08 12:39:21 nixos kernel: CPU1: shutdown
Feb 08 12:39:21 nixos kernel: psci: CPU1 killed (polled 0 ms)
Feb 08 12:39:21 nixos kernel: CPU2: shutdown
Feb 08 12:39:21 nixos kernel: psci: CPU2 killed (polled 0 ms)
Feb 08 12:39:21 nixos kernel: CPU3: shutdown
Feb 08 12:39:21 nixos kernel: psci: CPU3 killed (polled 0 ms)
Feb 08 12:39:21 nixos kernel: CPU4: shutdown
Feb 08 12:39:21 nixos kernel: psci: CPU4 killed (polled 0 ms)
Feb 08 12:39:21 nixos kernel: CPU5: shutdown
Feb 08 12:39:21 nixos kernel: psci: CPU5 killed (polled 4 ms)
Feb 08 12:39:21 nixos kernel: Enabling non-boot CPUs ...
Feb 08 12:39:21 nixos kernel: Detected VIPT I-cache on CPU1
Feb 08 12:39:21 nixos kernel: GICv3: CPU1: found redistributor 1 region 0:0x00000000fef20000
Feb 08 12:39:21 nixos kernel: CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
Feb 08 12:39:21 nixos kernel: CPU1 is up
Feb 08 12:39:21 nixos kernel: Detected VIPT I-cache on CPU2
Feb 08 12:39:21 nixos kernel: GICv3: CPU2: found redistributor 2 region 0:0x00000000fef40000
Feb 08 12:39:21 nixos kernel: CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
Feb 08 12:39:21 nixos kernel: CPU2 is up
Feb 08 12:39:21 nixos kernel: Detected VIPT I-cache on CPU3
Feb 08 12:39:21 nixos kernel: GICv3: CPU3: found redistributor 3 region 0:0x00000000fef60000
Feb 08 12:39:21 nixos kernel: CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
Feb 08 12:39:21 nixos kernel: CPU3 is up
Feb 08 12:39:21 nixos kernel: Detected PIPT I-cache on CPU4
Feb 08 12:39:21 nixos kernel: GICv3: CPU4: found redistributor 100 region 0:0x00000000fef80000
Feb 08 12:39:21 nixos kernel: CPU4: Booted secondary processor 0x0000000100 [0x410fd082]
Feb 08 12:39:21 nixos kernel: CPU4 is up
Feb 08 12:39:21 nixos kernel: Detected PIPT I-cache on CPU5
Feb 08 12:39:21 nixos kernel: GICv3: CPU5: found redistributor 101 region 0:0x00000000fefa0000
Feb 08 12:39:21 nixos kernel: CPU5: Booted secondary processor 0x0000000101 [0x410fd082]
Feb 08 12:39:21 nixos kernel: CPU5 is up
Feb 08 12:39:21 nixos kernel: usb usb5: root hub lost power or was reset
Feb 08 12:39:21 nixos kernel: usb usb6: root hub lost power or was reset
Feb 08 12:39:21 nixos kernel: cdn-dp fec00000.dp: [drm:cdn_dp_pd_event_work [rockchipdrm]] Not connect>
Feb 08 12:39:21 nixos kernel: usb usb7: root hub lost power or was reset
Feb 08 12:39:21 nixos kernel: usb usb8: root hub lost power or was reset
Feb 08 12:39:21 nixos kernel: OOM killer enabled.
Feb 08 12:39:21 nixos kernel: Restarting tasks ... done.
Feb 08 12:39:21 nixos kernel: PM: suspend exit

@samueldr
Owner Author

samueldr commented Feb 8, 2020

Now, I impatiently wait for the changes :).

@theotherjimmy

Seems that my branch was old, and I have amended the series yet again. I'm going to push to a different branch, after rebasing with the latest master.

@xantoz

xantoz commented May 25, 2020

From what I hear, one must use the BSP U-Boot rather than mainline U-Boot for this to work, even with the mainline/Manjaro kernel.

Manjaro has got it working that way.

@samueldr
Owner Author

@xantoz yes, you're right, see #7.

@theotherjimmy

Long time no progress. I finally have an automated reproducer for the suspend issues, using levinboot. This makes testing any fixes a much quicker process. I've now confirmed that TF-A enters the suspend state correctly (as far as I can tell) and that no external input can wake it. I have to track down how to enable a method for an external wake event (including at least the power button, as I can't easily test the lid switch with the back of my pbp) and confirm wakeup after that.

@theotherjimmy

I suppose I should update this. Note that I have not worked on this in about 2 months (maybe more). I have managed to configure a wakeup source in TF-A, and, with much help from crystalgamma, was able to get LPDDR4 resume on the right track. However, the current issue is that returning from the enable-mmu code in TF-A raises an unhandled exception by returning to an unmapped address. I recently had my first child, so I may not be returning to this for a bit 😅 If someone wants to take up the torch, I can provide my patch series, but I'm hesitant to post it publicly.

@tgunnoe

tgunnoe commented Nov 20, 2020

Is this issue referring to the battery drain I get when "suspending" the laptop? It usually doesn't seem to save any power.

@theotherjimmy

@tgunnoe, yes. The drain at the moment goes from 100% (ish) to 0% in < 36 hours. With suspend support in upstream TF-A, the power drain could be minimized to allow up to 20 days of suspend, or something like that.
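
As a rough check on those two figures, here is the implied average drain rate in each case. This assumes the battery drains linearly over the whole period, which is a simplification and not something measured in this thread:

```latex
\[
r_{\text{now}} \ge \frac{100\%}{36\ \text{h}} \approx 2.8\ \%/\text{h},
\qquad
r_{\text{target}} \approx \frac{100\%}{20 \times 24\ \text{h}}
  = \frac{100\%}{480\ \text{h}} \approx 0.21\ \%/\text{h},
\qquad
\frac{r_{\text{now}}}{r_{\text{target}}} \ge \frac{480}{36} \approx 13
\]
```

So "up to 20 days" would correspond to cutting the average suspend drain by a factor of roughly 13 or more.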

@shadowrylander

Any updates on this? I'm looking to get a Pinebook Pro this month or so, and was wondering if I could help in any way!

@theotherjimmy

Right, so I finally got around to hooking up a debugger to my PBP this past week. 🤞 I'll be able to push the patch upstream soon.

@theotherjimmy

It's published: https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/9616

@shadowrylander

@theotherjimmy Wait; so how do we use this, again...? Sorry, a bit new to this!

@samueldr
Owner Author

samueldr commented Apr 13, 2021

(Assuming NixOS), you would apply the patch to the TF-A for RK3399.

For example, you could add it to the override used here:

atf = armTrustedFirmwareRK3399.overrideAttrs (oldAttrs: {
  src = fetchFromGitHub {
    owner = "ARM-software";
    repo = "arm-trusted-firmware";
    rev = "9935047b2086faa3bf3ccf0b95a76510eb5a160b";
    sha256 = "1a6pm0nbgm5r3a41nwlkrli90l2blcijb02li7h75xcri6rb7frk";
  };
  version = "2020-06-17";
});

Not sure if there is a minimum version of TF-A that needs to be used.

This, in turn, will be used by U-Boot. So you'll need to build and update U-Boot accordingly for your setup.
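
To make "add it to the override" concrete, here is a minimal sketch of what that could look like. It assumes the change from review.trustedfirmware.org has been downloaded and saved next to the Nix file; the file name `./tf-a-rk3399-suspend.patch` is hypothetical, and whether the patch applies cleanly to this pinned revision has not been checked here.

```nix
# Sketch only: the override quoted above, extended with a `patches` entry.
# `./tf-a-rk3399-suspend.patch` is a hypothetical local file name for the
# TF-A change, saved by hand from the Gerrit review.
atf = armTrustedFirmwareRK3399.overrideAttrs (oldAttrs: {
  src = fetchFromGitHub {
    owner = "ARM-software";
    repo = "arm-trusted-firmware";
    rev = "9935047b2086faa3bf3ccf0b95a76510eb5a160b";
    sha256 = "1a6pm0nbgm5r3a41nwlkrli90l2blcijb02li7h75xcri6rb7frk";
  };
  version = "2020-06-17";
  # Keep any patches the base package already carries, then append ours.
  patches = (oldAttrs.patches or [ ]) ++ [
    ./tf-a-rk3399-suspend.patch
  ];
});
```

Appending to `oldAttrs.patches or [ ]` rather than assigning a fresh list avoids silently dropping patches the base package may already apply.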

@theotherjimmy

I developed the patch based on a branch off of pre-2.3. It should apply cleanly to anything starting 2.3 onward.

@shadowrylander

@samueldr So override U-Boot with the patch, set it up, and build?

@theotherjimmy

@shadowrylander This is a patch for TF-A, not U-Boot.

@shadowrylander

(Assuming NixOS), you would apply the patch to the TF-A for RK3399.

For example, you could add it to the override used here:

atf = armTrustedFirmwareRK3399.overrideAttrs (oldAttrs: {
  src = fetchFromGitHub {
    owner = "ARM-software";
    repo = "arm-trusted-firmware";
    rev = "9935047b2086faa3bf3ccf0b95a76510eb5a160b";
    sha256 = "1a6pm0nbgm5r3a41nwlkrli90l2blcijb02li7h75xcri6rb7frk";
  };
  version = "2020-06-17";
});

Not sure if there is a minimum version of TF-A that needs to be used.

This, in turn, will be used by U-Boot. So you'll need to build and update U-Boot accordingly for your setup.

So where would I apply the patch in the link provided here...?

@shadowrylander

Or wait; was the comment by @samueldr not for me originally?

@theotherjimmy

@shadowrylander It was probably meant for all of us.

@samueldr samueldr changed the title from "Suspend to RAM with tsys' kernel tracking issue" to "Suspend to RAM with mainline" May 11, 2021
@samueldr
Owner Author

samueldr commented May 11, 2021

Do we need #7's changes for this to work? Namely ROCKCHIP_SIP and ROCKCHIP_SUSPEND_MODE.

(I still haven't taken the time to actively test...)

@theotherjimmy

Do we need #7's changes for this to work? Namely ROCKCHIP_SIP and ROCKCHIP_SUSPEND_MODE.

No. Closed #7.

@samueldr
Owner Author

I can verify that, with the default Tow-Boot build for the Pinebook Pro, which at this time includes the patch, this works for me on 5.11.

@theotherjimmy

@samueldr Thanks for being one of the first testers who isn't me! I feel a lot better knowing that my results have been reproduced.

@samueldr
Owner Author

@theotherjimmy not knowing much about all this, I still feel the review comments about how it may or may not actually work, depending on the conditions it resumes from, are probably valid.

But at least in limited testing it seems to work.

One time the PBP panicked: it had slept for a short while, but it panicked long after resuming.

@theotherjimmy

Oh, dang. That's probably related to not restoring the lower frequency as the reviewer suggested might be the case.

That being said, without debugging, I have no idea.

@samueldr
Owner Author

Exactly. I tried reproducing it: I left the Pinebook Pro under similar conditions, booted it, suspended it shortly after boot for a short time (not even a minute, I think), then left it on without display suspend.

The time it panicked, it had been on for, I think, under 12 hours; leaving it on for ~36 hours didn't seem to reproduce the issue.

This is going to be a hard one to reproduce, if indeed it is related to the suspend/resume cycle and that suggestion.

@theotherjimmy

theotherjimmy commented May 25, 2021

Honestly, if it's panicking, RAM is working. Unless we're seeing corruption.

@samueldr
Owner Author

I wouldn't know enough to confirm or deny :)

@zhaofengli

Tried out the patch with an NVMe drive, and the drive is frozen upon wake up. It still shows up in lspci but any operation against the drive hangs. I think the behavior is consistent with suspending with the BSP U-Boot + TF-A.

[ 1934.301295] INFO: task fdisk:3791 blocked for more than 966 seconds.
[ 1934.304747]       Tainted: P         C O      5.10.35 #1-NixOS
[ 1934.308194] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1934.311687] task:fdisk           state:D stack:    0 pid: 3791 ppid:  3780 flags:0x00000001
[ 1934.311698] Call trace:
[ 1934.311713]  __switch_to+0x10c/0x168
[ 1934.311723]  __schedule+0x2c4/0x738
[ 1934.311729]  schedule+0x50/0xd8
[ 1934.311737]  blk_queue_enter+0x138/0x290
[ 1934.311742]  submit_bio_noacct+0x364/0x400
[ 1934.311748]  submit_bio+0x54/0x1e0
[ 1934.311754]  mpage_readahead+0x154/0x188
[ 1934.311760]  blkdev_readahead+0x20/0x30
[ 1934.311767]  read_pages+0xa0/0x288
[ 1934.311771]  page_cache_ra_unbounded+0x13c/0x218
[ 1934.311776]  do_page_cache_ra+0x48/0x58
[ 1934.311780]  force_page_cache_ra+0xb0/0x108
[ 1934.311785]  page_cache_sync_ra+0x54/0x120
[ 1934.311791]  generic_file_buffered_read+0x4b8/0xa30
[ 1934.311796]  generic_file_read_iter+0x108/0x1a8
[ 1934.311802]  blkdev_read_iter+0x44/0x58
[ 1934.311808]  new_sync_read+0xf0/0x188
[ 1934.311813]  vfs_read+0x150/0x1e0
[ 1934.311818]  ksys_read+0x74/0x100
[ 1934.311823]  __arm64_sys_read+0x24/0x30
[ 1934.311830]  el0_svc_common.constprop.0+0x80/0x1a8
[ 1934.311835]  do_el0_svc+0x2c/0x98
[ 1934.311841]  el0_svc+0x20/0x30
[ 1934.311846]  el0_sync_handler+0xb0/0xb8
[ 1934.311852]  el0_sync+0x178/0x180
[ 1956.829228] nvme nvme0: I/O 9 QID 0 timeout, completion polled
[ 1956.829375] nvme nvme0: 6/0/0 default/read/poll queues
[ 1987.549123] nvme nvme0: I/O 325 QID 2 timeout, aborting
[ 2018.268958] nvme nvme0: I/O 2 QID 0 timeout, completion polled
[ 2018.269094] nvme nvme0: Abort status: 0x0
[ 2018.269158]  nvme0n1: p1
[ 2079.708782] nvme nvme0: I/O 13 QID 0 timeout, completion polled
[ 2141.148592] nvme nvme0: I/O 14 QID 0 timeout, completion polled

@theotherjimmy

Yes, I think that's expected behavior ATM. I'd love to fix it, as I now have an NVMe drive in my PBP, but it's extremely low on my priority list.

@miniBill
Contributor

miniBill commented Apr 4, 2022

What's the current status of suspend to RAM?

@yatli

yatli commented Jun 7, 2022

Hey guys, I'm trying to bring s2ram to mainline for DevTerm A06 (rk3399).
https://forum.clockworkpi.com/t/getting-suspend-to-work-properly-on-a06/8404/18?u=yatli

My "working branch" is here :)
https://github.com/yatli/arm-trusted-firmware/tree/rk3399_dev

I've marked and aligned some routines from the rkbin bl31.elf, but there are still missing pieces: the way ATF accepts aux parameters, the mysterious PMUGRF_OS_REG2, and the way PSCI is called (not returning from WFI, but jumping into the suspend_finish routine; the DevTerm never reaches suspend_finish).

Do you have any suggestions or tips? Currently it goes into deep sleep but doesn't come back. I'm fairly new to all this but gradually finding my way around...

It'd also be cool to be able to set up a debug UART or a debugger, as currently all I can do is output debug bits by having the onboard fan spin or not spin...

8 participants