Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

M500 doesn't always save properly when done midprint #416

Closed
buzztiaan opened this issue Mar 10, 2013 · 107 comments
Closed

M500 doesn't always save properly when done midprint #416

buzztiaan opened this issue Mar 10, 2013 · 107 comments

Comments

@buzztiaan
Copy link

M92 E700 during print.
M500 during print.

M503 shows Esteps now being 851 (wtf)

Doesn't always happen, but i now need to doublecheck if the correct settings are saved, and resave them (often)

@ErikZalm
Copy link
Contributor

Access to eeprom is critical when there are interrupts running.
We should block M500 if the printer is moving.

@buzztiaan
Copy link
Author

This also occurs when the printer is not moving.

My arduino is a 2560 r3.

@ErikZalm
Copy link
Contributor

The heater interrupts should also be disabled. But we can disable them for a short time without problems. We can not disable the stepper interrupt during movement.

@boelle
Copy link
Contributor

boelle commented Dec 19, 2014

This one is created about a year ago and there have been a lot of changes, please download the latest copy of marlin and see if the problem is still there. Also you the latest arduino IDE to flash the marlin firmware. If you board files etc only work under old ide upgrade those first so they work under latest IDE.

If you create board files for hardware that are not in the https://github.com/ErikZalm/Marlin/tree/Marlin_v1/ArduinoAddons then please fork marlin and add the files and then create a pull request so that we can get the hardware supported. This will also give an idea what hardware people are using.

@boelle boelle closed this as completed Dec 19, 2014
@buzztiaan
Copy link
Author

not sure what you are doing boelle ...

@boelle
Copy link
Contributor

boelle commented Dec 24, 2014

right now having xmas dinner 😃

Den onsdag den 24. december 2014 skrev buzztiaan [email protected]:

not sure what you are doing boelle ...


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68065935.

@buzztiaan
Copy link
Author

i ment about closing bugs that are not flagged fixed , nor lack any reproduction method

@galexander1
Copy link
Contributor

boelle was just doing much-needed house cleaning. since you say it exists, i'll just re-open it.

regarding the fix, if the problem is interrupts coming during eeprom writes, it really needs to pause printing while it writes. probably not hard to do, but i have to wonder, why do you want to change the eeprom values when printing?

@galexander1 galexander1 reopened this Dec 24, 2014
@buzztiaan
Copy link
Author

tuning esteps during print helps visually inclined tuners ;)

saving during print makes sure the paper you wrote the new esteps on is not the only place this gets stored

@boelle
Copy link
Contributor

boelle commented Dec 24, 2014

oh yes, i closed all those more than year old, then all those with very
little action.

they can be opened again if people respond 😜

Den onsdag den 24. december 2014 skrev buzztiaan [email protected]:

i ment about closing bugs that are not flagged fixed , nor lack any
reproduction method


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68068265.

@boelle boelle added this to the Bug Fixing Round 1 milestone Dec 27, 2014
@Lukelectro
Copy link

I think the folowing way would work to save eeprom settings during print:

Make sure heaters are not currently heating (to prevent overheating) and nothing is moving
disable interupts
save settings to eeprom
re-enable interupts
carry on with what you where doing.

Maybe simply blocking M500 during print (make it return a message like "can't save to eeprom during print") is a better solution, because turning off the heaters and stopping movement during a print, even for a short while, will affect the print. (And keeping the heaters on while disabling temperature control seems a even worse idea)

@boelle
Copy link
Contributor

boelle commented Jan 2, 2015

are you up to doing the fix to get it to work?

@alexborro
Copy link
Contributor

How long does it takes to save the eeprom?? I was thinking about some
miliseconds, nothing more.. I don't think miliseconds will mess heaters
and/or movement...

By the way, I have already saved the eeprom during printing and no problem
at all.

Cheers.

Alex.

2015-01-02 18:50 GMT-02:00 Bo Herrmannsen [email protected]:

are you up to doing the fix to get it to work?


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68561025.

"Não é o mais forte da espécie que sobrevive, nem o mais inteligente. É
aquele que se adapta melhor as mudanças" ( Charles Darwin )

Alex Borro

@boelle
Copy link
Contributor

boelle commented Jan 2, 2015

so for a dummie this means that means we can save while printing without any issue? given it does not take more than a few milis to save

@Lukelectro
Copy link

3.4ms per byte, as per page 22 of the atmega 328 datasheet. That's probably a typical specification, I don't know what's worst case.

quotes from datasheet:

"
Table 5-1.
3.4 ms Erase and Write in one operation (Atomic Operation)
1.8 ms Erase Only
1.8 ms Write Only
"

And here is the reason for these problems:

"
The EEPROM Write Enable Signal EEPE is the write strobe to the EEPROM. When address
and data are correctly set up, the EEPE bit must be written to one to write the value into the
EEPROM. The EEMPE bit must be written to one before a logical one is written to EEPE, other-
wise no EEPROM write takes place. The following procedure should be followed when writing
the EEPROM (the order of steps 3 and 4 is not essential):

  1. Wait until EEPE becomes zero.
  2. Wait until SELFPRGEN in SPMCSR becomes zero.
  3. Write new EEPROM address to EEAR (optional).
  4. Write new EEPROM data to EEDR (optional).
  5. Write a logical one to the EEMPE bit while writing a zero to EEPE in EECR.
  6. Within four clock cycles after setting EEMPE, write a logical one to EEPE.
    The EEPROM can not be programmed during a CPU write to the Flash memory. The software
    must check that the Flash programming is completed before initiating a new EEPROM write.
    Step 2 is only relevant if the software contains a Boot Loader allowing the CPU to program the
    Flash. If the Flash is never being updated by the CPU, step 2 can be omitted. See “Boot Loader
    Support – Read-While-Write Self-Programming, ATmega88 and ATmega168” on page 268 for
    details about Boot programming.

Caution: An interrupt between step 5 and step 6 will make the write cycle fail, since the
EEPROM Master Write Enable will time-out. If an interrupt routine accessing the EEPROM is
interrupting another EEPROM access, the EEAR or EEDR Register will be modified, causing the
interrupted EEPROM access to fail. It is recommended to have the Global Interrupt Flag cleared
during all the steps to avoid these problems.
"

@alexborro
Copy link
Contributor

I have just counted how many write command we have: about 35 commands.. If
the most variable are bytes, so it will take less than 150ms.. lets say
200ms at maximum.

2015-01-02 19:16 GMT-02:00 Lukelectro [email protected]:

3.4ms per byte, as per page 22 of the atmega 328 datasheet. That's
probably a typical specification, I don't know what's worst case.

quotes from datasheet:

"
Table 5-1.
3.4 ms Erase and Write in one operation (Atomic Operation)
1.8 ms Erase Only
1.8 ms Write Only
"

"
The EEPROM Write Enable Signal EEPE is the write strobe to the EEPROM.
When address
and data are correctly set up, the EEPE bit must be written to one to
write the value into the
EEPROM. The EEMPE bit must be written to one before a logical one is
written to EEPE, other-
wise no EEPROM write takes place. The following procedure should be
followed when writing
the EEPROM (the order of steps 3 and 4 is not essential):

  1. Wait until EEPE becomes zero.
  2. Wait until SELFPRGEN in SPMCSR becomes zero.
  3. Write new EEPROM address to EEAR (optional).
  4. Write new EEPROM data to EEDR (optional).
  5. Write a logical one to the EEMPE bit while writing a zero to EEPE in
    EECR.
  6. Within four clock cycles after setting EEMPE, write a logical one to
    EEPE.
    The EEPROM can not be programmed during a CPU write to the Flash memory.
    The software
    must check that the Flash programming is completed before initiating a new
    EEPROM write.
    Step 2 is only relevant if the software contains a Boot Loader allowing
    the CPU to program the
    Flash. If the Flash is never being updated by the CPU, step 2 can be
    omitted. See “Boot Loader
    Support – Read-While-Write Self-Programming, ATmega88 and ATmega168” on
    page 268 for
    details about Boot programming.

Caution: An interrupt between step 5 and step 6 will make the write cycle
fail
, since the
EEPROM Master Write Enable will time-out. If an interrupt routine
accessing the EEPROM is
interrupting another EEPROM access, the EEAR or EEDR Register will be
modified, causing the
interrupted EEPROM access to fail. It is recommended to have the Global
Interrupt Flag cleared
during all the steps to avoid these problems.
"


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68563070.

"Não é o mais forte da espécie que sobrevive, nem o mais inteligente. É
aquele que se adapta melhor as mudanças" ( Charles Darwin )

Alex Borro

@gregrebholz
Copy link
Contributor

I would guess most values are int (2 bytes) or floats (4 bytes). The EEPROM version identifier is a 4 byte character string (3 characters and null). Other arduino sizes are long (4 bytes), double (4 bytes; 8 bytes on the Due), or boolean (1 byte).

Anyway, probably closer to 1/2 second, but the stepper motor code is entirely dependent on receiving regular interrupts trigged by the Arduino's internal timer (handled by ISR(TIMER1_COMPA_vect)) and it will not be able to generate the stepper pulses during this time with interrupts disabled. Presumably the blocks in the move planner just don't get popped, so movement can resume when interrupts are enabled again, but you should at least still see a noticeable halt in movement and it may throw off position... I have no idea how a stepper motor would behave if the pulse train to the coils froze for a while. @ErikZalm did say "We can not disable the stepper interrupt during movement" in the comments above, so I think allowing M500 during movement should get a lot of rigorous testing before any implementation. In the meantime better to block M500 rather than write corrupted data to the EEPROM, which is what is happening today.

@thawkins
Copy link
Contributor

thawkins commented Jan 3, 2015

Can you not cheat and insert an action into the planner queue to write the eeprom contents, that way it should be action-ed between atomic movements. the action can just suspend the heater ISR, if and turn it back on when finished.

@alhirzel
Copy link
Contributor

alhirzel commented Jan 3, 2015

I am not 100% familiar with the code here but it is very important to know all sources of interrupts if we're going to do this... I've listed what I know about, any others maybe buried in libraries?

  • extruder driving in stepper.cpp:TIMER0_COMPA_vect
  • stepper driving in stepper.cpp:TIMER1_COMPA_vect
  • hotend/hotbed control, reading filament width in temperature.cpp:TIMER0_COMPB_vect
  • 4000ms watchdog timer in watchdog.cpp:WDT_vect

@thawkins - if I understand, you're proposing that we forecast times when the stepper and extruder drivers will be in motion for 500ms so we have a safe EEPROM window in which we can disable interrupts, kick the EEPROM then sei() and be on our way. We would leave the watchdog running and disable the temp control for this time. Sound about right?

@alhirzel
Copy link
Contributor

alhirzel commented Jan 3, 2015

If we decide to do that, we have no way of guaranteeing the storage will occur (imagine a particularly crafty set of gcode that never gives us the desired gap). How do we deal with that possibility, slow the print or error to the user?

@thawkins
Copy link
Contributor

thawkins commented Jan 3, 2015

I was suggesting that the eprom write code is run off a handler that is
driven of an action inserted into the motion queue, in that way the write
can only occure between movements and not during a movement, as it will be
syncronised with the stream of movement actions.
On Jan 4, 2015 1:25 AM, "Alex Hirzel" [email protected] wrote:

I am not 100% familiar with the code here but it is very important to know
all sources of interrupts if we're going to do this... I've listed what I
know about, any others maybe buried in libraries?

  • extruder driving in stepper.cpp:TIMER0_COMPA_vect
  • stepper driving in stepper.cpp:TIMER1_COMPA_vect
  • hotend/hotbed control, reading filament width in
    temperature.cpp:TIMER0_COMPB_vect
  • 4000ms watchdog timer in watchdog.cpp:WDT_vect

@thawkins https://github.com/thawkins - if I understand, you're
proposing that we forecast times when the stepper and extruder drivers will
be in motion for 500ms so we have a safe EEPROM window in which we can
disable interrupts, kick the EEPROM then sei() and be on our way. We would
leave the watchdog running and disable the temp control for this time.
Sound about right?


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68602267.

@thawkins
Copy link
Contributor

thawkins commented Jan 3, 2015

i was not suggesting to insert it into a suitable length move, I was
suggesting that it is inserted between moves, which should be atomic.

On Sun, Jan 4, 2015 at 1:27 AM, Alex Hirzel [email protected]
wrote:

If we decide to do that, we have no way of guaranteeing the storage will
occur (imagine a particularly crafty set of gcode that never gives us the
desired gap). How do we deal with that possibility, slow the print or error
to the user?


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68602318.

@alhirzel
Copy link
Contributor

alhirzel commented Jan 3, 2015

But if the print pauses for 200ms (edit: not 500ms) mid-print that may cause undesirable results e.g. with a hot hotend resting in one place, it could disturb surface finish on the piece.

@alhirzel
Copy link
Contributor

alhirzel commented Jan 3, 2015

Yep I think I understand now =] Just trying to think of the consequences

@thawkins
Copy link
Contributor

thawkins commented Jan 3, 2015

I agree that there is potential for a pause on the print and the
introduction of a "hicky", but probably no worse than USB buffering pauses
introduce,.

And if it impacts the print then people should not do it... Their choice.

On Sun, Jan 4, 2015 at 2:30 AM, Alex Hirzel [email protected]
wrote:

Yep I think I understand now =] Just trying to think of the consequences


Reply to this email directly or view it on GitHub
https://github.com/ErikZalm/Marlin/issues/416#issuecomment-68604416.

@alhirzel
Copy link
Contributor

alhirzel commented Jan 3, 2015

We could split the EEPROM write up into multiple operations, e.g. write each multibyte value separately but atomically. This would decrease the delay to <<200ms and reduce the effect I'm worried about. Can also ship it with a disclaimer or leave it disabled by default or something like that like you've said

Last worry is what if EEPROM write fails, any avenue of that compromising the print?

@boelle
Copy link
Contributor

boelle commented Jan 7, 2015

great... then we have 2 testing ATM

@alexborro
Copy link
Contributor

I have tried writing to EEPROM while printing under heavy CPU load and it
was perfect.

My CoreXY bot was moving at 450mm/s which results in really small time
between interrupts. No problems at all.

Cheers.

Alex.

2015-01-07 21:51 GMT-02:00 Bo Herrmannsen [email protected]:

great... then we have 2 testing ATM


Reply to this email directly or view it on GitHub
#416 (comment)
.

"Não é o mais forte da espécie que sobrevive, nem o mais inteligente. É
aquele que se adapta melhor as mudanças" ( Charles Darwin )

Alex Borro

@boelle
Copy link
Contributor

boelle commented Jan 8, 2015

3.... hmm... should we still give it the 2 full weeks and close?

@avluis
Copy link
Contributor

avluis commented Jan 8, 2015

+1 Cannot reproduce here either.

@stv0g
Copy link
Contributor

stv0g commented Jan 8, 2015

We already have 5 people which are not able to reproduce the issue: @avluis @alexborro @arno-millenaar @fmalpartida

I'll try it in a minute.
I would declare this issue as not verified and close it.
Or we could move it to a future milestone.

@daid
Copy link
Contributor

daid commented Jan 8, 2015

My 2 cents. Could be a hardware glitch. We had to replace a few Arduino's at Ultimaker due to the EEPROM not working properly (settings got lost all the time)

(Note that it's most likely a rare glitch, as we've only had a few of them, and we sold lots of Ultimakers)

@boelle
Copy link
Contributor

boelle commented Jan 8, 2015

just move on to the other milestones... and let us thinker what to do here

@boelle boelle removed this from the Bug Fixing Round 1 milestone Jan 8, 2015
@boelle
Copy link
Contributor

boelle commented Jan 8, 2015

created a new label called hard to verify work on it later.... put that one on here and removed it from milestone 1.... so yaaa milestone 1 passed on to the next one... :-D

@thawkins
Copy link
Contributor

thawkins commented Jan 8, 2015

I wasnt able to contribute to milestone one, my test machine is nearing
completion now, hopefully this weekend, so i will probaly will be able to
start contributing in the later milestones. I also run a 3d printing news
aggregation site, called microfabfricator.com which we are adding bunch of
contributer written articles to, one of the articles i would like to add is
how to setup for contribution to Marlin, and how to run the workflow.
On Jan 8, 2015 9:30 PM, "Bo Herrmannsen" [email protected] wrote:

created a new label called hard to verify work on it later.... put that
one on here and removed it from milestone 1.... so yaaa milestone 1 passed
on to the next one... :-D


Reply to this email directly or view it on GitHub
#416 (comment)
.

@boelle
Copy link
Contributor

boelle commented Jan 8, 2015

dont feel bad, my machine is also down, need new bearings and last bit of electronics... will be another month before i can get it running...

food on table is more important :-)

this one will not be closed yet, but we know 5 have tried to reproduce the error and they could not... we move on as not to waste to much time

@thinkyhead
Copy link
Member

Old enough to punt...

@buzztiaan
Copy link
Author

lol again?
'this is too old'
'it still exists'
'but meh its too old'

wtf man :D

On Fri, Feb 20, 2015 at 2:01 PM, Scott Lahteine [email protected]
wrote:

Old enough to punt...


Reply to this email directly or view it on GitHub
#416 (comment)
.

@thinkyhead
Copy link
Member

Open a new issue. Alert the TESTERS topic. Reference this issue in your new issue. (It gets you to the top again!)

@thinkyhead thinkyhead reopened this Feb 20, 2015
@daid
Copy link
Contributor

daid commented Feb 20, 2015

Honestly, who cares? You should not do an M500 during printing. It's silly.

Hard to impossible to reproduce. Lot of effort to fix. Low to almost no practical impact on the user. I would close it and be done with it. Not worth the limited time of the developers that are working on stuff.

@thawkins
Copy link
Contributor

Or just ignore the command if the system is printing and just respond with
a message of "system printing, save ignored"

On Fri, Feb 20, 2015, 22:41 daid [email protected] wrote:

Honestly, who cares? You should not do an M500 during printing. It's silly.

Hard to impossible to reproduce. Lot of effort to fix. Low to almost no
practical impact on the user. I would close it and be done with it. Not
worth the limited time of the developers that are working on stuff.


Reply to this email directly or view it on GitHub
#416 (comment)
.

@nophead
Copy link
Contributor

nophead commented Feb 20, 2015

I think we concluded M500 will work just fine regardless of interrupts if avrlib is recent enough. And I think nobody has managed to recreate the bug except @buzztiaan, who didn't answer my question here: #416 (comment).

@thinkyhead
Copy link
Member

Yeah. Consider M500 during printing to be not a currently supported use-case, and for the time being it can be mentioned in the documentation that it's not guaranteed to work, due to timing issues perhaps.

Meanwhile, if you are able to experiment with your robot, which exhibits the problem, and to determine a way to fix it, please submit a PR.

As the codebase generally improves, with kinks worked out in the interrupt and serial communication code, etc., this problem may get sorted out. If some simple tweak, such as adding a delay, turns out to work around the issue on your system, that would be good for us to know, even if we can't understand why.

My pet obsessions with Marlin have been in UX and code cleanup, so not too deep with timing. I've cleaned up the EEPROM code some, and on its face that code is pretty simple. Some underlying conflict, perhaps with thermocouples or noisy I/O, could boondoggle otherwise good code. So what we ultimately need is a larger sample of users to test it, and I admit it's hard to find testers with enough stake in it to carry it out. To make that easier you could supply some procedure, or a piece of G-Code that demonstrates the issue, and which when run reveals the problem pretty consistently. Even in putting that together you may find some pattern that has been missed.

I understand the complaint that problems like this can linger without resolution. You go to the doctor and tell him when I do this it hurts, and all he does is tell you don't do that, 1000 coins please. Let's hope medicine never goes altogether open source... but the way this project seems to work best is when those who face a problem are able to find the solution. If code is not your thing, you may try to recruit from near and far, someone to assist you. But you can get far just messing around, time permitting.

@boelle
Copy link
Contributor

boelle commented Mar 11, 2015

i will close this one again

  1. as the others say you should not save during print
  2. if you have a recently updated IDE / avrlib

i will also lock it so that it can only be opend by collabs if there is a VERY good reason to do so

@boelle boelle closed this as completed Mar 11, 2015
@MarlinFirmware MarlinFirmware locked and limited conversation to collaborators Mar 11, 2015
@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests