Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] arm32: introduce suspend/resume common functions #1729

Merged
merged 11 commits into from
Sep 18, 2017

Conversation

MrVan
Copy link
Contributor

@MrVan MrVan commented Aug 2, 2017

This is to introduce low power suspend/resume support for arm32. The work flow is similar as linux suspend/resume, the different is linux suspend/resume saves/restores linux side contexts and system registers, while the optee suspend/resume saves/restores tee side contexts and system registers. Also some differences that ttbr0 is saved/restored.
Now only 1 cluster max 4 cores supported and with Cortex-A7/A9 MP core tested.
This piece code is not the whole picture, but these are all the common code I implemented, others are platform specific code, such as platform normally works at svc mode when reset, but when tee resume we need to switch to monitor mode because psci frameworks runs in monitor mode. And code handling platform suspend/resume coping into onchip ram that could survive cpu suspend/resume.

@MrVan
Copy link
Contributor Author

MrVan commented Aug 8, 2017

Are there any comments about this RFC patch set?

read_nmrr r5
read_vbar r6
read_nsacr r7
stmia r0, {r4 - r7}
Copy link
Contributor

@igoropaniuk igoropaniuk Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stmia r0!, {r4 - r7} (! missed) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to use r0! here, this is the last inst to store register to stack and then r0 is not needed.

* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

header guards

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix in V2

@jforissier
Copy link
Contributor

@MrVan FYI the Travis build failure can be fixed by rebasing on top of the current master branch, in order to to get 9aea3dc ("Travis: add -p to mkdir $HOME/bin").

@MrVan
Copy link
Contributor Author

MrVan commented Aug 14, 2017

Patch updated with addressing @igoropaniuk 's comments and rebased with 9aea3dc ("Travis: add -p to mkdir $HOME/bin"). Any comments?

.section .text
FUNC cpu_suspend, :
UNWIND( .fnstart)
stmfd sp!, {r4 - r11, lr}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend push {r4-r11, lr}
Push one more register for 8byte stack alignment.
Add UNWIND(.save {r4-r11, lr})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try this. Thanks.

*/
and r6, r6, #0x3
lsl r6, r6, #0x3
stmfd sp!, {r0, r1}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend push {r0, r1}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

ldr r4, =suspend_reg_size
add r4, r4, #12
add sp, sp, r4
ldmia sp!, {r4 - r11, lr}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend pop {r4-r11, lr}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

ldr r4, =suspend_reg_size
mov r5, sp
add r4, r4, #12
sub sp, sp, r4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how unwind could handle this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to reserve space in stack for saving regs that will lost when cpu losing power. unwind may not able to handle this. I may need to add KEEP_PAGER the functions.

* the &sleep_save_sp_virt = ((u32)suspend_sp + 4 * cpu_id)
*/
and r6, r6, #0x3
lsl r6, r6, #0x3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lsl r6, r6, #0x2 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Each cpu needs 2 words(8 bytes), lsl r6, r6, #3, means r6 * 8

ldr r3, =cpu_resume_after_mmu
isb
write_sctlr r0
mrc p15, 0, r0, c0, c0, 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you cross check this (from here to the 2nd mov r0, r0 below. Looks useless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_sctlr r0 actually will enable MMU. Yeah, that could removed on ARMV7 cpus. This could be

write_tlbiall
write_sctlr r0
dsb
isb
``

*/
read_diag r10
teq r5, r10
mcrne p15, 0, r5, c15, c0, 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below: is it harmful to restore an already updated value ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This could be simplified. No one actually restore it before this point.

* Each cpu needs 2 words to save ptr and virt_to_phys(ptr).
* We max support 4 cpus now,so need 4 * 2 * sizeof(uint32_t);
*/
uint32_t suspend_sp[4 * 2] __aligned(8);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more convenient to store in struct thread_core_local. What's the alignment for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. it maybe a bit hard to refer thread_core_local, i'll try. No specific requirement on the alignment, could be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a thought, thread_core_local could not be used here. Because suspend_sp will be used when cpu resume at very early stage to restore registers from sp(physical address). At this time,accessing thread_core_local is not easy, also have to fix the layout of thread_core_local for the saved sp pointer. I prefer use suspend_sp here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thread_core_local could be made a global variable just as suspend_sp is here, I don't see what's different.
core/arch/arm/kernel/asm-defines.c can be used to find offsets of different elements in the struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll try this. please hold on for the review patch, I missed this comments when updating the patch.

* The content will be passed to cpu_do_resume
* as register r1 and sp
*/
*ptr++ = 0; /* Not used now */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a proper struct for this instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not add a struct here. The layout will be refered in asm code, It is clear using current method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me.
We use structs at several places in the code that are accessed from assembly code, both for convenience and as documentation. There's core/arch/arm/kernel/asm-defines.c to help if needed.

*/
.align 2
.global suspend_reg_size
.equ suspend_reg_size, 4 * 16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't a regular define be used instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give a try.

@MrVan
Copy link
Contributor Author

MrVan commented Aug 19, 2017

@jenswi-linaro @etienne-lms Update with your review comments addressed. Please check whether this is ok for you. I have not do board test for the new review patch, later I'll test it.

@@ -52,6 +53,14 @@ DEFINES
DEFINE(THREAD_SVC_REG_R0, offsetof(struct thread_svc_regs, r0));
DEFINE(THREAD_SVC_REG_R5, offsetof(struct thread_svc_regs, r5));
DEFINE(THREAD_SVC_REG_R6, offsetof(struct thread_svc_regs, r6));

DEFINE(PM_CTX_SP, offsetof(struct sm_pm_ctx, sp));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/PM_CTX/SM_PM_CTX/ for these four lines

* as register r1 and sp
*/
ptr->sp = sp;
ptr->cpu_resume_addr = virt_to_phys(u.resume_addr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd cast as (void *)(vaddr_t)cpu_do_resume instead you wouldn't need to involve a union.

read_mpidr r1
and r1, r1, #0x3

adr r0, _suspend_sp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ldr r0, =suspend_sp instead?

If you we're using thread_core_local instead the sequence would be something like (and then we'd get rid of the assumption above too):

bl get_core_pos
mul r0, r0, #THREAD_CORE_LOCAL_SIZE
ldr r1, =thread_core_local
add r1, r1, r0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point mmu is not enabled. bl get_core_pos could not be used here. I'll use thread_core_local here.

Copy link
Contributor

@jenswi-linaro jenswi-linaro Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't get_core_pos() flat mapped?

Edit:
By the way, get_core_pos() is called quite early in the boot before mmu is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not rely on physical address equals to virtual address. I could use bl get_core_pos here, but when virtual address is not same physical address, this needs to rewrite. I could add a piece code just as get thread_core_local phyiscal address to get get_core_pos physical address here.

#include <keep.h>
#include <kernel/unwind.h>

#define PRRR 0x034
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused defines?

FUNC cpu_resume, :
UNWIND(.fnstart)
/* Disable IRQ/FIQ */
mrs r1, cpsr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpsid if is only one instruction. I'd recommend disabling async aborts too, so then it would be cpsid aif.

write_sctlr r0
dsb
isb
nop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reference to why you need exactly three nops?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specific reference, just would like to flush instruction pipe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that needed? Handling an erratum?

adr lr, cpu_suspend_panic
/* Jump to arch specific suspend*/
pop {r0, pc}
/* cpu not suspended */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen, considering that lr is pointing to cpu_suspend_panic()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bug here. This will be fixed in new version patches.

blx cpu_suspend_save
adr lr, cpu_suspend_panic
/* Jump to arch specific suspend*/
pop {r0, pc}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the value of pc supposed to be written?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpu_suspend(arg, platform_suspend_func), Here pc is platform_suspend_func.

push {r4, r5}
read_midr r4
ubfx r5, r4, #4, #12
ldr r4, =0xc07
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This magic number could be a defined in <arm32.h> instead.
The assignment could also use mov_imm instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using mov_imm, "Error: invalid constant (c07) after fixup"

@MrVan MrVan force-pushed the lpm branch 3 times, most recently from 6a32dcf to 5b54219 Compare August 22, 2017 02:55
@MrVan
Copy link
Contributor Author

MrVan commented Aug 22, 2017

@jenswi-linaro I updated the patches, not review patch. I have tested it on mx7dsabresd, with suspend and dual-core low power idle, works well. I have switched to use thread_core_local and addresses a few bugs in the previous review patch which not tested.

@@ -33,6 +33,8 @@
#include <stdint.h>
#include <util.h>

#define A7_PART_NUM 0xC07
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`s/A7_PART_NUM/CORTEX_A7_PART_NUM/
Please add

#define MIDR_PRIMARY_PART_NUM_SHIFT 4
#define MIDR_PRIMAGE_PART_NUM_WIDTH 12

To be able to get rid of the other two magic numbers in the assembly code concerning MIDR below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

};

/* suspend/resume core functions */
void cpu_suspend_init(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a "sm_pm" prefix over a "cpu" prefix for these functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is just a place holder, I could remove it, since not used now. Actually I would prefer cpu means this piece code is only related to cpu core, not about device in a SoC. This function will be removed in new version patches.

#include <sm/sm.h>
#include <sm/pm.h>

#include "../kernel/thread_private.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please export struct thread_core_local in <thread.h> instead of including "../kernel/thread_private.h" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


FUNC cpu_v7_suspend, :
UNWIND(.fnstart)
push {r4 - r12, lr}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to save r12 and lr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r12 may not needed, but lr is needed. cpu_suspend_save->cpu_do_suspend->cpu_v7_suspend. lr is needed to back to cpu_suspend_save. r12 here is just make stack 8 bytes aligned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no calls in this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll refine this.

#include <kernel/unwind.h>

.section .text
FUNC cpu_suspend, :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment describing arguments and return value of each exported function in this file.
Something like:

/*
 * int cpu_suspend(uint32_t arg, int (*fn)(uint32_t));
 */

Is enough. It doesn't hurt to describe the local functions in the same manner but it's not as important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

FUNC cpu_resume_mmu, :
UNWIND(.fnstart)
ldr r3, =cpu_resume_after_mmu
write_tlbiall
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLB was already invalidated in cpu_do_resume(), why is that needed again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed. I'll drop it.

nop
nop
nop
bx r3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just a simple b cpu_resume_after_mmu?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_sctlr r0 will enable MMU. After mmu enabled, bx r3 to jump to cpu_resume_after_mmu.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The address you jump to will be the same so why make it complicated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll try it.

UNWIND( .fnend)
END_FUNC cpu_resume

FUNC cpu_resume_after_mmu, :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOCAL_FUNC

UNWIND( .fnend)
END_FUNC cpu_resume_after_mmu

FUNC cpu_resume_mmu, :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOCAL_FUNC

UNWIND( .fnend)
END_FUNC cpu_do_suspend

FUNC cpu_resume, :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOCAL_FUNC

@MrVan
Copy link
Contributor Author

MrVan commented Aug 24, 2017

@jenswi-linaro Updated with a single review patch. I understand code is a bit hard to read, thanks for your detailed reviewing. Patch has tested on mx7dsabresd board, with suspend/resume low power idle and xtest.

@MrVan
Copy link
Contributor Author

MrVan commented Aug 25, 2017

Any more comments on this? In my side, it passed 24 hours test on mx7dsabresd.

@jenswi-linaro
Copy link
Contributor

I guess we're all waiting for the remaining patches to get the full picture.

@MrVan
Copy link
Contributor Author

MrVan commented Aug 25, 2017

ok. I'll clean up platform specific code and send out as whole for review.

@MrVan
Copy link
Contributor Author

MrVan commented Aug 28, 2017

Updated the patchset to include platform specific code. The PATCH 9/9 is not good enough to upstream, if ok, I would prefer other 8 patches go upstream and leave the last patch. welcome comments.

ldr r2, [r0]
add r0, r0, r2

/* Assmue max 4 cpus and 1 cluster */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is generic code, can't we try to support more cpus and more than 1 cluster from start? 4x2 seems pretty common these days.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbech-linaro This patchset is to support armv7 which mostly only contains 4 cores. Regarding armv8 aarch32, I do not have platforms to test. I could use simliar method as accessing thread_core_local here to use get_core_pos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, well I think this would serve as a good start and we can deal with that question later on. I'm not familiar with imx-specfics, but all in all I think this looks good, thanks!


/*
* Store sec ctx regs
* Now we are in SVC sec mode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment accurate? I can't find where the switch to SVC sec mode is done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be monitor mode. Also in the asm mode switch code, it should be back to monitor code and save svc mode, as the following:

-               "mov    r3, #0x16\r\n" /* MON */
+               "mov    r3, #0x13\r\n" /* SVC */
                "orr    r3, r3, r4\r\n"
                "msr    cpsr, r3\r\n"
                "mrs    r1, spsr\r\n"
                "str    r1, [r2], #4\r\n"
                "str    sp, [r2], #4\r\n"
                "str    lr, [r2], #4\r\n"
-               "msr    cpsr, r0\r\n"           /* SVC mode */
+               "msr    cpsr, r0\r\n"   /* Mon mode */

When pushed to github, missed to fix this.

* Store sec ctx regs
* Now we are in SVC sec mode
*/
asm volatile (
Copy link
Contributor

@jenswi-linaro jenswi-linaro Aug 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a lot like what sm_save_modes_regs() is doing, couldn't sm_save_modes_regs(&nsec->mode_regs) be used instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check that. After reviewing patch 9/9, you could get the whole picture of how it works:)

regs = &reg_stack[0];

/*
* Store sec ctx regs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually nonsecure ctx regs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non secure regs are saved by non-secure world by itself. This is to save secure ctx, because cpu lose power, all regs will be lost, so we need to save secure ctx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The secure registers hasn't been loaded by the secure monitor yet so the content is actually non-secure, unless I'm missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstand. yes. secure registers not loaded. This is save non-secure context. I'll correct the comment.

plat_cpu_reset_late();

/* Restore register of different mode in secure world */
asm volatile (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If sm_save_modes_regs() is used above, use sm_restore_modes_regs(&nsec->mode_regs) instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is restore secure world context. Ditto, I'll check using sm_restore_modes_regs.

str r7, [r11, #0x38]
wait_shutdown:
wfi
nop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the nop instructions really needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This similar code has been in FSL/NXP releases for long time. I do not want to remove them for potential risk here.

wfi:
/* Enter stop mode */
wfi
nop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these nop instructions needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

mov r4, #(BIT32(6) | BIT32(7))
orr r3, r3, r4
msr cpsr, r3
nop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nop for a particular reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I'll remove it.

bne _inv_nextway

dsb @ ; synchronise
nop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't an isb instead of these two nop be more appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.


resume:
write_iciallu
write_bpiall
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should perhaps have dsb isb after this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. will add them.

isb

/* switch to monitor mode from svc mode */
mov r3, #CPSR_MODE_MON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cps #CPSR_MODE_MON instead of this long sequence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@MrVan
Copy link
Contributor Author

MrVan commented Aug 30, 2017

@jbech-linaro Updated with using get_core_pos in suspend.S @jenswi-linaro Updated with addressing comments for psci-suspend-imx7.S.

Sending out the imx specific things to let people could get a whole picture, but here the first thing I would move forward is upstreaming the common parts, saying patch 1/9-8/9. I am glad, if you think the last patch 9/9 is good to be accepted. In next version patch, I will extract the psci macros into standalone patch from patch 9/9.

: "r0", "r1", "r2", "r3", "r4"
);
/* Store non-sec ctx regs */
sm_save_modes_regs(&mode_regs[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&nsec->mode_regs is not used at this stage and could be used instead of the local variable mode_regs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'll use that in next version patch after receiving more comments.


FUNC cpu_do_suspend, :
UNWIND(.fnstart)
push {r4, r5}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either UNWIND(.cantunwind) above or add UNWIND(.save {r4, r5}) below.

add r0, sp, #8
blx cpu_suspend_save
adr lr, aborted
/* Jump to arch specific suspend*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: space before */

push {r4 - r12, lr}
UNWIND(.save {r4 - r12, lr})
mov r5, sp
sub sp, sp, #SM_PM_CTX_SIZE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unwinding sequence will get lost with this, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have much knowledge about unwind. Here it may lost. I would not like to change the design here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should replace .save {r4 - r12, lr} with .cantunwind

END_FUNC cpu_suspend

LOCAL_FUNC cpu_v7_suspend, :
UNWIND(.fnstart)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNWIND(.fnstart) or UNWIND(.cantunwind) ?

@@ -481,6 +481,8 @@ static inline TEE_Result cache_op_outer(enum cache_op op __unused,
/* Check cpu mmu enabled or not */
bool cpu_mmu_enabled(void);

void map_memarea_sections(const struct tee_mmap_region *mm, uint32_t *ttb);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is core_mmu_v7.c, specific not LPAE. A generic cpu_mmu_map_memarea() could direct to the mmu_v7/lpae map functions. Lpae would be not supported at first I guess.
minor: can add a description since the function is now exported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have platforms with LPAE enabled. I'll add a description.

@@ -78,4 +78,8 @@ $(call force,CFG_SECURE_TIME_SOURCE_REE,y)
CFG_BOOT_SECONDARY_REQUEST ?= y
endif

ifeq ($(filter y, $(CFG_PSCI_ARM32)), y)
CFG_CORE_RWDATA_NOEXEC ?= n
endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend CFG_HWSUPP_MEM_PERM_WXN=n and keep CFG_CORE_RWDATA_NOEXEC=y.
The later insures optee_os mapping splits executable and data, and the former does not enforce write-implies-noexec from SCTLR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll use CFG_HWSUPP_MEM_PERM_WXN ?= n in next version.

}

/* Note IRAM_S_BASE is not 1M aligned, so take care */
map.pa = ROUNDDOWN(IRAM_S_BASE, 0x100000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: s/0x100000/CORE_MMU_PGDIR_SIZE/, here and below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix in next version.

uint32_t (*ddrc_phy_offset_array)[2];
uint32_t suspend_ocram_base = (uint32_t)phys_to_virt(
(paddr_t)TRUSTZONE_OCRAM_START +
SUSPEND_OCRAM_OFFSET, MEM_AREA_TEE_COHERENT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is coherent memory, we can expect phys=virt. Maybe use phys_to_virt() only under an assert().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not get you. Could you please explain more?

isb
mov r0, #0
/* Back to caller of cpu_suspend, with return value 0 */
pop {r4 - r12, pc}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on the push in cpu_suspend() above. It should be noted somehow. Perhaps we could just jump to the pop instruction in cpu_suspend()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is back to the next instruction just after saying bl cpu_suspend. The pop instruction in cpu_suspend is that cpu not suspended. Here the pop in cpu_resume path is cpu really suspened. They are in different code path. Let's not use jump to mix the two path.


FUNC cpu_resume_mmu, :
UNWIND(.fnstart)
isb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isb not needed, done just before jumping here.
I'm not sure we need this function at all since it doesn't do that much. I think the code would be easier to follow/understand if cpu_do_suspend() and cpu_do_resume() did match each other more. The helper functions cpu_v7_suspend() and cpu_resume_mmu() changes the symmetry more than isolating different problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_sctlr r0 will turn on mmu. To keep safe, I prefer we keep isb here. cpu_resume_mmu here is to turn on mmu, and run in virtual address, before cpu_resume_mmu, all is runing in physical address. cpu_v7_suspend is common for armv7 here. But regarding a7/a9/a8/a15 and etc, cpu_do_suspend could handle. I think better to keep them.

@MrVan
Copy link
Contributor Author

MrVan commented Sep 11, 2017

@etienne-lms Updated. struct thread_core_local from thread_private.h to thread.h was done in a specific patch; r2 and r3 are not removed in psci-suspend-imx7.S at the beginning, they are not used now; add comments for _core_pos; sorted the includes for psci-sus-end-imx7.S.
Tested on mx7dsabresd board.
Thanks for revewing.

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more minor comments + check bp cache sync before restoring sctlr (requitred on c-a9).

Comments on commit messages:

  • 2nd commit "core: arm: psci: pass nsec ctx to psci" rephrase proposal (expands the shortcuts)
    Pass non-secure context to psci functions. When cpu/system suspends,
    cpu may loose power, so when back to linux from tee, tee
    needs to return to a linux resume point, not the usual return address
    after "smc" instruction. So we need to modify the mon_lr
    value in non-secure context.

    The function in linux side is psci_system_suspend or
    psci_ops.cpu_suspend(state, __pa(cpu_resume)). (remove this part)

    Psci runs in monitor mode, sm_get_nsec_ctx can not be used,
    so pass the non-secure context pointer to the psci suspend function.

  • commit3: "core: arm: psci: add suspend resume common functions"
    s/platform/Platform/
    s/that not losing/that is not loosing/

    Last sentence: The "arg" passed to "func" as "r0"
    Proposal: Argument "arg" is passed to function "func" as argument through register "r0".

  • core: imx: simplify code
    s/registeration/registration/

@@ -1,5 +1,6 @@
#include <kernel/thread.h>
#include <stdint.h>
#include <sm/sm.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: move before #include <stdint.h>`


#ifdef ARM64
/*
* struct thread_core_local need to have alignment suitable for a stack
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: s/need/needs/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just moving code from thread_private.h to thread.h. Anyway I'll fix it.

#include <stdint.h>
#include <sm/optee_smc.h>
#include <sm/psci.h>
#include <sm/sm.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: fix include order:

#include <sm/optee_smc.h>
#include <sm/psci.h>
#include <sm/pm.h>                               ... moved up
#include <sm/sm.h>
#include <stdint.h>                              ... moved down

ctx->cpu_resume_addr =
virt_to_phys((void *)(vaddr_t)sm_pm_cpu_do_resume);

sm_pm_cpu_do_suspend(&ctx->suspend_regs[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: s/&ctx->suspend_regs[0]/ctx->suspend_regs/

#include <kernel/asan.h>
#include <platform_config.h>
#include <keep.h>
#include <kernel/unwind.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: remove #include <kernel/unwind.h> at list top and fix includes ordering.

isb
dsb
/* MMU will be enabled here */
write_sctlr r8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will enable branch prediction. Needs a write_bpiall before dsb/isb ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done psci-suspend-imx7.S, not in this pm_a32.S, is this ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From psci-suspend-imx7.S,

  • if cores stay up (standby case): the sequence enables bp from imx7_suspend, then enables mmu here. I think bp should be enabled only after mmu is enabled (for cores requiring a manul enable of bp).
  • In cpu shutdown case, cpu wakes to resume. In this case, it seems to me that both mmu and bp will be both enables from here write_sctlr r8, without previous bp cache cleanup (write_bpiall).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If cores stay up, it will not run into cpu resume. In psci-suspend-imx7.S, there are two wfi. If cpu stay up, cpu will directly runs into the following instruction of wfi.

If cpu shutdown, write_sctlr will restore branch predictor. In psci-suspend-imx7.S, line 645. If you think it is required to add a write_bpiall here before dsb/isb, I could add one here to handle the case that platform specific code missed to add write_bpiall in resume path.


FUNC sm_pm_cpu_do_suspend, :
UNWIND(.fnstart)
UNWIND(.cantunwind)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: for consistency, add a tab after UNWIND(, here and there.

@@ -85,7 +85,7 @@ void plat_cpu_reset_late(void)
write32(0x00FF0033, core_mmu_get_va(CSU_CSL_15, MEM_AREA_IO_SEC));
/*
* Proect SRC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe s/Proect/Protect/

register_phys_mem(MEM_AREA_IO_SEC, ANATOP_BASE, CORE_MMU_DEVICE_SIZE);
#endif
#ifdef GICD_BASE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: GIC_BASE is likely to cover GICC and GICD.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On i.MX8, there is GICD and GICR, no GICC used here. Since this piece code is shared to i.mx family. I seperated them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

if (!suspended_init) {
imx7_suspend_init();
suspended_init = 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: imx7_suspend_init() could be registered through init_service().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not change it now in this patchset. Later with new patches, I could change it and do a new round stress test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@MrVan
Copy link
Contributor Author

MrVan commented Sep 12, 2017

Update the commit messages according to @etienne-lms 's suggestion. Reordered the include files. need->needs in thread.h Use ctx->suspend_regs to replace &ctx->suspend_regs[0] in pm.c. using tab in UNWIND. Add a write_bpiall before enabling mmu in resume path. And a few minor fixes.

Thanks for reviewing.

@MrVan
Copy link
Contributor Author

MrVan commented Sep 15, 2017 via email

@MrVan
Copy link
Contributor Author

MrVan commented Sep 15, 2017

@jenswi-linaro @etienne-lms do you have more comments? Is this patchset ok to have your r-b/a-b tags?

Copy link
Contributor

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my minor comments addressed:

For commits that changes below core/arch/arm/plat-imx:
Acked-by: Jens Wiklander <[email protected]>
For the other changes:
Reviewed-by: Jens Wiklander <[email protected]>

@@ -279,6 +280,8 @@ struct thread_handlers {
void thread_init_primary(const struct thread_handlers *handlers);
void thread_init_per_cpu(void);

struct thread_core_local *thread_get_core_local(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should go into the "core: arm: kernel: make thread_core_local public" commit

@@ -89,7 +89,7 @@

struct thread_ctx threads[CFG_NUM_THREADS];

static struct thread_core_local thread_core_local[CFG_TEE_CORE_NB_CORE];
struct thread_core_local thread_core_local[CFG_TEE_CORE_NB_CORE];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should go into the "core: arm: kernel: make thread_core_local public" commit

@@ -163,8 +163,6 @@ void thread_init_vbar(void);
/* Handles a stdcall, r0-r7 holds the parameters */
void thread_std_smc_entry(void);

struct thread_core_local *thread_get_core_local(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should go into the "core: arm: kernel: make thread_core_local public" commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jenswi-linaro. I'll fix this, then add your tags.

@MrVan
Copy link
Contributor Author

MrVan commented Sep 15, 2017

"struct thread_core_local *thread_get_core_local(void);" changes moved to "core: arm: kernel: make thread_core_local public" commit.
Rebased to master
Added r-b/a-b tags

@MrVan
Copy link
Contributor Author

MrVan commented Sep 16, 2017

Missed one comment from @jenswi-linaro in last patchset:

-static struct thread_core_local thread_core_local[CFG_TEE_CORE_NB_CORE];
+struct thread_core_local thread_core_local[CFG_TEE_CORE_NB_CORE];

This change has been address in lastest patchset, with TAGS applied. Thanks.

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkpatch warn on (a fix from a comment of mine)

  • commit3: "core: arm: psci: add suspend resume common functions"
    WARNING: 'loosing' may be misspelled - perhaps 'losing'?
    => s/that not loosing/that is not losing/

reviewed-by
core: mmu: export map_memarea_sections
core: arm: psci: pass nsec ctx to psci
core: arm: kernel: make thread_core_local public
core: arm: psci: add suspend resume common functions (see last comment)
core: arm: imx: move psci code to pm
core: arm: sm: add psci power state macros

acked-by
core: imx7: fix comments
core: imx: simplify code
core: arm: imx: get mmdc type
core: arm: imx7d: remove soc_is_imx7d/s functions
core: arm: imx7d: add psci suspend support

DEFINE(THREAD_CORE_LOCAL_SIZE, sizeof(struct thread_core_local));

DEFINE(SM_PM_CTX_SP, offsetof(struct sm_pm_ctx, sp));
DEFINE(SM_PM_CTX_RESUME_ADDR,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: SM_PM_CTX_SP, SM_PM_CTX_RESUME_ADDR and SM_PM_CTX_SUSPEND_ADDR and not used.
SM_PM_CTX_SUSPEND_ADDR store the resume address. Maybe just remove these defines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll remove them.

Export map_memarea_sections. We need a mmu table
dedicated for low power feature, so export
map_memarea_sections to create that section mapping.

Signed-off-by: Peng Fan <[email protected]>
Reviewed-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Pass non-secure context to psci functions. When cpu/system suspends,
cpu may loose power, so when back to linux from tee, tee
needs to return to a linux resume point, not the usual return address
after "smc" instruction. So we need to modify the mon_lr
value in non-secure context.

Psci runs in monitor mode, sm_get_nsec_ctx can not be used,
so pass the non-secure context pointer to the psci suspend function.

Signed-off-by: Peng Fan <[email protected]>
Reviewed-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Move the struture of thread_core_local from thread_private.h
to thread.h to make it public.

Signed-off-by: Peng Fan <[email protected]>
Reviewed-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Add cpu suspend/resume common functions.

Platform psci suspend functions need to call
sm_pm_cpu_suspend(arg, platform_suspend) to runs into suspend.

The i.MX flow is:
psci_cpu_suspend->imx7_cpu_suspend->sm_pm_cpu_suspend(arg, func)
The "func" runs in on-chip ram that not losing power when
system runs into suspend or low power state. Argument "arg" is
passed to function "func" as argument through register "r0".

Signed-off-by: Peng Fan <[email protected]>
Reviewed-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Move psci code to pm.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Fix comments.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
Wrap memory registration using macros to make it easy to add new soc/arch
support.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
Add get mmdc type support, this will be used when configuring
ddr into self refresh for low power feature.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
Add PSCI_POWER_STATE_X macros

Signed-off-by: Peng Fan <[email protected]>
Reviewed-by: Jens Wiklander <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Remove soc_is_imx7d/s functions. Not needed.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
Implement i.MX7D suspend/resume support.
When the first time runs into suspend, some initialization work needs
to be done, such as code copy, iram translation table.

Since we only have 32K on chip RAM for suspend/resume usage, we have
to put code and data together and use section mapping and WXN is set
to false.

Signed-off-by: Peng Fan <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
@MrVan
Copy link
Contributor Author

MrVan commented Sep 18, 2017

checkpatch warning fixed for commit3; uneeded SM_PM_CTX_xx macros removed.
Applied @etienne-lms 's r-b/a-b tags;

@jforissier
Copy link
Contributor

@MrVan thanks for this valuable contribution.

@jforissier jforissier merged commit 1295874 into OP-TEE:master Sep 18, 2017
@jbech-linaro
Copy link
Contributor

@MrVan thanks for this valuable contribution.

Indeed, thanks a lot for all the patience here!

@glneo glneo mentioned this pull request Sep 18, 2017
@MrVan MrVan deleted the lpm branch September 19, 2017 00:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants