0.8.0-develop: Firmware hangs, no watchdog reset #60

Closed
dwagenk opened this issue Sep 2, 2020 · 11 comments

@dwagenk

dwagenk commented Sep 2, 2020

Pressing the button to put the device to sleep while touching the screen causes the device to lock up. Rebooting it by long-pressing the button doesn't work. I had to open it and reset it by briefly shorting the power pins via the debug connector (which usually happens to me anyway when trying to fiddle the debug cables in there).

If you can't reproduce the behavior, I'll retry with a debug probe connected to get more information on what is actually happening.

V_20200902_222020_1

@lupyuen
Collaborator

lupyuen commented Sep 2, 2020

I noticed similar behaviour with 0.8.0 too. I called it the "PineTime Defibrillator Syndrome"...

  1. I was charging PineTime on the cradle. It powered on. Pressing the button was OK

  2. While charging, the screen blacked out. Button did not respond. Maybe something came loose and it stopped charging

  3. I tried pressing the PineTime cover closed really tight and put it back on the cradle. Still nothing on the screen. Button did not respond

  4. Here's the very spooky thing... I opened the back cover. I tapped the Pogo Pins on the PineTime SWD Port. PineTime came back to life!

  5. The spooky thing: The Pogo Pins were connected to ST-Link. BUT ST-Link was NOT connected to USB. There's no power at all!

  6. I closed up the back cover again, took some pics, button was OK. After that the screen went blank. Maybe the battery drained

  7. I put PineTime back on the cradle, screen still didn't come on. Button did not respond

  8. I opened the back cover, tapped the Pogo Pins on the SWD Port again. It came back to life! Screen was OK, button was OK

So that's the PineTime Defibrillator Syndrome... Somehow it needs something to jolt it back to life

I recall some folks having a similar problem with other firmware... PineTime is charged up but doesn't turn on. Could it be the same issue?

@dwagenk
Author

dwagenk commented Sep 3, 2020

Regarding points 4, 5 and 8: at least with my clumsy fingers, that behavior is due to briefly shorting VCC and GND with the debug header when trying to get it in there. That should trigger a reset, either through brown-out detection or, if that is not set up (I don't know whether it needs some configuration on Nordic MCUs), simply through the brief loss of supply voltage.

@JF002
Collaborator

JF002 commented Sep 8, 2020

@dwagenk Thanks for this very accurate description. The video was very helpful, and I can reproduce this issue too!
The fact that it freezes is strange, but what I really cannot understand is why the watchdog does not reboot the watch! I'll have to analyze that!

@lupyuen Maybe you hit the same bug as @dwagenk and created a small short when tapping the pogo pins on the SWD connector, which triggered a hardware reset?

@JF002
Collaborator

JF002 commented Sep 9, 2020

I think this bug is caused by a race condition between the IRQ and device (re)initialization when the watch is woken up: the ISR is called before the SPI/TWI devices are correctly reconfigured.
This causes an infinite while loop inside the Display task. It doesn't trigger the watchdog because SystemTask is still running correctly and keeps refreshing the watchdog.
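
To illustrate why a watchdog fed by a single healthy task cannot see a hang elsewhere, here is a minimal sketch of one common mitigation: a software gate that only allows the hardware watchdog to be refreshed once every monitored task has checked in. This is an assumption about how the problem could be approached, not how InfiniTime handles it; `WatchdogGate`, `CheckIn` and `ShouldFeed` are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch, not InfiniTime code: a gate in front of the hardware
// watchdog. Each monitored task calls CheckIn() from its main loop. The
// supervising task calls ShouldFeed() once per period and refreshes the
// hardware watchdog only when it returns true.
class WatchdogGate {
public:
  // Supports up to 32 tasks, identified by ids 0 .. taskCount - 1.
  explicit WatchdogGate(unsigned taskCount)
      : expected(taskCount >= 32 ? 0xFFFFFFFFu : ((1u << taskCount) - 1u)) {}

  // Called by task `id` to signal that it is still making progress.
  void CheckIn(unsigned id) {
    if (id < 32) {
      alive.fetch_or(1u << id);
    }
  }

  // Returns true only if every task checked in since the previous call.
  // The flags are cleared atomically so the next period starts from scratch.
  bool ShouldFeed() {
    return (alive.exchange(0u) & expected) == expected;
  }

private:
  const std::uint32_t expected;         // bit mask with one bit per task
  std::atomic<std::uint32_t> alive{0};  // bits set by CheckIn()
};
```

With such a gate, a Display task spinning in an infinite loop would stop checking in, `ShouldFeed()` would return false, and the hardware watchdog would eventually expire even though SystemTask is still alive.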

A quick fix is to disable the pushbutton IRQ for a short time (200 ms) after it has been triggered. This way, it won't be possible to request a wakeup while the system is still going to sleep.
A better fix would require improving the sleep/wakeup workflow so that this race condition becomes impossible.
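
The lockout idea can be sketched as follows. Note that this sketch filters button events in software by timestamp rather than disabling the IRQ itself, and it is not taken from the actual workaround; `ButtonLockout` and `Accept` are hypothetical names, and the timestamps are assumed to come from a free-running 32-bit millisecond counter.

```cpp
#include <cstdint>

// Hypothetical sketch: once a button event has been accepted, ignore further
// events until `windowMs` milliseconds have elapsed.
class ButtonLockout {
public:
  explicit ButtonLockout(std::uint32_t windowMs) : lockoutMs(windowMs) {}

  // Returns true if the press observed at time `nowMs` should be handled.
  // Unsigned subtraction keeps the comparison correct across counter wraparound.
  bool Accept(std::uint32_t nowMs) {
    if (armed && static_cast<std::uint32_t>(nowMs - lastAcceptedMs) < lockoutMs) {
      return false;  // still inside the lockout window
    }
    armed = true;
    lastAcceptedMs = nowMs;
    return true;
  }

private:
  std::uint32_t lockoutMs;
  std::uint32_t lastAcceptedMs = 0;
  bool armed = false;  // false until the first press has been accepted
};
```

With a 200 ms window, a press at t = 1000 ms is accepted, presses at 1100 ms and 1199 ms are dropped, and a press at 1200 ms is accepted again.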

Still, the watchdog is running, and a long push (7-10 s) on the button prevents it from being refreshed, so the MCU actually resets. I'm not sure why, but it looks like the bootloader is stuck somewhere, maybe also in the device initialization? @lupyuen any idea on how to debug this?

@JF002
Collaborator

JF002 commented Sep 13, 2020

I analyzed this a bit further, and it is more complex than previously anticipated: race conditions occur between SystemTask and DisplayApp. DisplayApp uses the SPI bus to draw on the display, while SystemTask decides when to put the devices to sleep. There is also asynchronous processing (touch IRQ and SPI DMA) that makes it all the more complex.

The bad news is that I managed to reproduce this in 0.7.1 too (just push on the button like crazy, the screen will eventually stay black).

This bug is more likely to happen in 0.8 RC because of the addition of Sleep/Wakeup methods on the SPI and TWI, where the race condition puts the devices into an incoherent state (the device is disabled while a transaction is running).
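
One way to rule out "device disabled while a transaction is running" is to make sleep and transactions mutually exclusive. The sketch below assumes a synchronous transfer (the callable returns only once the transfer has completed, so it does not cover a DMA transfer that finishes later in an interrupt) and uses `std::mutex` as a stand-in for an RTOS mutex. `GuardedBus` and its methods are hypothetical names, and this is not necessarily the fix that went into the firmware.

```cpp
#include <mutex>

// Hypothetical sketch: a bus wrapper in which Sleep() cannot interleave with
// a running transaction, and transactions are refused while the bus is asleep.
class GuardedBus {
public:
  // Runs `transfer` with the bus held. Returns false without running it if
  // the bus is asleep. `transfer` must not call Sleep() or Wakeup() itself,
  // since the mutex is not recursive.
  template <typename Fn>
  bool Transaction(Fn&& transfer) {
    std::lock_guard<std::mutex> lock(busMutex);
    if (sleeping) {
      return false;
    }
    transfer();
    return true;
  }

  // Blocks until any in-flight transaction has finished, then marks the bus
  // as asleep. A real driver would power down the peripheral here.
  void Sleep() {
    std::lock_guard<std::mutex> lock(busMutex);
    sleeping = true;
  }

  // Marks the bus as usable again. A real driver would reconfigure the
  // peripheral here, before releasing the mutex.
  void Wakeup() {
    std::lock_guard<std::mutex> lock(busMutex);
    sleeping = false;
  }

private:
  std::mutex busMutex;
  bool sleeping = false;
};
```

A display task would wrap each draw in `Transaction(...)` and skip or retry the draw when it returns false, instead of spinning on a peripheral that has been disabled.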

I'll try to find an "easy" fix to unblock this 0.8.0 release. Unfortunately, I don't have much time for now to work on PineTime :/

We should also have a look at the bootloader: why can't it run properly after a watchdog reset when the SPI has been put into an incoherent state?

@JF002
Collaborator

JF002 commented Sep 13, 2020

I pushed a workaround for this issue: https://github.com/JF002/Pinetime/tree/sleep-race-condition-workaround
It seems to prevent a total freeze of the firmware, but sometimes the display shows garbage (a transaction is most certainly interrupted by the sleep mode).

I'll look for a better solution before releasing this workaround :)

JF002 added a commit that referenced this issue Sep 13, 2020
@JF002
Collaborator

JF002 commented Sep 13, 2020

OK, I think I've found a better solution! It's now in develop, and I'll release version 0.8.1 RC for you to test!
EDIT: here is the release: https://github.com/JF002/Pinetime/releases/tag/0.8.1-develop

@dwagenk
Author

dwagenk commented Sep 14, 2020

Thanks for all the work you've put into this!

The problem hasn't appeared on 0.8.1 yet. I'll try a little more ("pushing the button like crazy") and report back if I encounter any problems.

@yukdumboobumm

yukdumboobumm commented Sep 18, 2020

I can confirm this still occurs on 0.8.1. Some combination of button and touchscreen but nothing outside of typical user-behavior (I wasn't button mashing). Here's what I remember:

  • Time was incorrect and year was 1795 (or whatever the default is) even though time was synced the night before
  • I reconnected to gadgetbridge
  • I saw the time change, but the year stayed the same
  • I pushed the button while swiping on the screen

Firmware froze, gadgetbridge disconnected. Reset via shorting the pins. I've not been able to recreate it, so something in the chronology is probably wrong or unimportant.

@JF002
Collaborator

JF002 commented Sep 19, 2020

@yukdumboobumm Thanks for your feedback.
The default time and date are 1 January 1970, which is the Unix epoch. If the time was reset to this value, it most probably means that the firmware restarted (due to a crash or an empty battery).
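
As a small illustration of that default, the sketch below converts a count of seconds since the Unix epoch to a calendar year and flags implausibly early readings as "probably never synced since the last restart". The helper names and the 2020 threshold are hypothetical and not part of the firmware.

```cpp
#include <ctime>

// Returns the calendar year (UTC) for a number of seconds since the Unix
// epoch, or -1 if the conversion fails.
int YearFromUnixSeconds(std::time_t seconds) {
  const std::tm* utc = std::gmtime(&seconds);
  return utc != nullptr ? utc->tm_year + 1900 : -1;
}

// Hypothetical heuristic: a clock reading earlier than `minPlausibleYear`
// was most likely never synced after the last restart.
bool LooksUnsynced(std::time_t seconds, int minPlausibleYear = 2020) {
  return YearFromUnixSeconds(seconds) < minPlausibleYear;
}
```

A counter that restarted from zero reports the year 1970, so `LooksUnsynced(0)` is true, while a timestamp from September 2020 such as 1600000000 is not flagged.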

I've never noticed that the year would not be updated while the time was correctly sync'ed. If you can reproduce this behavior, could you please create a new issue?

I've tried many times to reproduce the crash you describe by swiping and pushing the button, with no success. Can you reproduce it easily?

Note that in the meantime, @lupyuen fixed the bootloader. With this new version of the bootloader, even if InfiniTime freezes or crashes, the bootloader should be able to run correctly and restart the watch.

@JF002
Collaborator

JF002 commented Sep 26, 2020

I couldn't reproduce any crash with this version, so I'm closing this bug. Do not hesitate to reopen it if necessary.

JF002 closed this as completed Sep 26, 2020
tgc-dk pushed a commit to tgc-dk/InfiniTime that referenced this issue Nov 28, 2023
…stall_ressource

littlefs-do: unzip in memory and copy listed resources to SPI raw file