Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce M108 cancel wait for heatup/cooldown of the hotend and bed #3611

Merged

Conversation

scalez
Copy link
Contributor

@scalez scalez commented Apr 25, 2016

This G-code is asynchronously handled in the get_serial_commands() parser.

This G-code is asynchronously handled in the get_serial_commands() parser.
@Blue-Marlin
Copy link
Contributor

Blue-Marlin commented Apr 25, 2016

Nice idea, but when a script (g-code program) is running as useless as M112. The buffers are full, no further command is read from the serial until M109/M190 has finished.
Cancelling from the menu can work.

@scalez
Copy link
Contributor Author

scalez commented Apr 25, 2016

@Blue-Marlin the command can still enter the commands_in_queue buffer but it does not need to, this is because L502 in the M109 while loop and L4575 in the M190 while loop calls idle() which calls manage_inactivity which calls get_available_commands() which will in turn call get_serial_commands() and allows a command to enter the commands_in_queue buffer.

The only situation where M108 would not be executed/enter the buffer is if (commands_in_queue < BUFSIZE), therefore the buffer should be properly managed by the host software. If the host software does not properly manage Marlin's buffer then not much can be done on our end. Our Cura LulzBot Edition makes a strong attempt to ensure that the buffer does not overflow and that no commands are waiting in the rx_buffer to enter the commands_in_queue buffer. Imo it is the responsibility of the host to make sure Marlin's buffer is managed correctly.

@Blue-Marlin
Copy link
Contributor

Blue-Marlin commented Apr 25, 2016

if (commands_in_queue < BUFSIZE) get_available_commands();

get_available_commands() is only called when there is some space in the buffer. During running a script usually there is no space.

Test:
Run a program. During waiting for M109 or M190 send your new g-code from the console. What happens?

@scalez
Copy link
Contributor Author

scalez commented Apr 25, 2016

@Blue-Marlin if a script is spamming Marlin with G-code then even the rx_buffer can overflow, we see this all the time and even have a commit to ensure this doesn't happen and notifies the host/script when it occurs, it also prevents truncation.

Take a look at alephobjects@dad9017. alephobjects@65e4ef9 and alephobjects@93b6bff fix a couple mistakes that slipped through the cracks.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 25, 2016

The host only "spams" if it receives an "OK" from Marlin, but in a printing situation, Marlin should have its buffer full as @Blue-Marlin is pointing out.

@scalez
Copy link
Contributor Author

scalez commented Apr 25, 2016

@jbrazio the host should not keep commands_in_queue full because that prevents the potential for certain commands to be executed asynchronously which is often a desired behavior. We have deliberately increased BUFSIZE to 10 in order to ensure that the commands_in_queue always has available elements to store G-code lines which may need to be executed asynchronously. M112 is another G-code which should never be blocked from asynchronously execution because the host improperly clogs commands_in_queue, that's definitely undesirable, even potentially dangerous.

It's the job of the host to maintain a certain number of lines inside commands_in_queue and not necessarily fill it/keep it filled. We may want to create a reserved element for asynchronous commands.

@kakaroto
Copy link

kakaroto commented Apr 25, 2016

@jbrazio @Blue-Marlin :

There are two parts to this, first there's Marlin and how it handles commands it receives (commands_in_queue, serial rx_buffer, async commands, etc..) and the second part is the host software, which sends these commands to Marlin.
Your issue about the commands_in_queue being full and M112 or M108 not being handled is an issue with the host software, it's not related to Marlin. Having that issue should not prevent M108 from being implemented. The fact that we would be unable to cancel a heatup through G-code even though it's an LCD feature, is not consistent with the rest of the firmware and rejecting the patch because it wouldn't work if commands_in_queue is full is not an acceptable (in my opinion) reason for not merging M108.

As for the issue of the commands_in_queue being full, I think that's the responsibility of the host software not to flood the firmware. In Cura LulzBot Edition, the host will keep track of the contents of the commands_in_queue variable, and will make sure not to overflow it, and when M109 or M190 enters the queue, it will stop sending anything to Marlin until that command is done, so that when M109 and M190 is being executed in Marlin, we are 100% sure that there are still at least 2 available spots in commands_in_queue and rx_buffer is empty. Any command (from the gcode or entered manually by the user) will be queued in software and not sent to Marlin unless it's an M108 in which case it is sent right away in order to cancel the heatup (at which point, the host will again fill commands_in_queue with the next 3 commands queued).
This ensures the proper/expected behavior happens and I think that's how the host should behave instead of just being a dump-file-to-serial-redirector and flooding the firmware regardless of what commands are sent to it.

Also, I believe that Pronterface sends only 1 command at a time to Marlin and only sends the next when it receives the 'ok'. The problem is that it will timeout a command after X seconds (5 or 10 or 30, I'm not sure), and will force the next command to be sent. The problem with that is that a M109 or M190 will usually last more than a few seconds, which causes Pronterface to start flooding Marlin with commands every X seconds until it receives the 'ok'. I think that's also a bug in the host because it interprets all commands as being equals. Cura LE has a different timeout depending on the command being executed, a G28, G29, M109 or M190 is expected to take longer than a G1, and the timeout is therefore different for those commands. This allows us to avoid unnecessarily assuming that the 'ok' was lost and causing the behavior that we wanted to avoid in the first place.

All of these are host issues/bugs (that hopefully we fixed properly in Cura LE), and should not affect the decision making in implementing a feature in Marlin, as I don't think it's the job of Marlin to limit itself in order to work around bugs in other people's software.

@thinkyhead
Copy link
Member

thinkyhead commented Apr 25, 2016

I'll try a crack at explaining this too. Sorry if I repeat anything already stated!

The host sends commands as fast as it can, while Marlin buffers commands as long as there is space in the buffer. Once the buffer is full, the host is "held off" and must wait for Marlin to have a space in the queue before its commands can be accepted. The buffer has 4 slots, by default. So the host might send:

M109
G1 X100 Y100 F4000
G1 X110 Y100 F4000
G1 X120 Y110 F4000

…and now the buffer is full, so the host is "held off." Any serial data sent by the host after these commands is still in the serial buffer.

You might think it's still be possible to intercept M108 (or M112) extra early by continuing to accept input from serial even when the buffer is full, but keep throwing it away. However, the way it works now is, if the buffer is full, Marlin just ignores the serial input, so the raw serial buffer gets full, and the lower-level serial interface is what keeps the host at bay, waiting until it can send more bytes.

@thinkyhead
Copy link
Member

thinkyhead commented Apr 26, 2016

P.S. The solution we are looking for is called "handshaking" and depends on both host and firmware following extra protocols.

  • Host sends only one command and waits
  • Marlin responds "got command X" (See ADVANCED_OK)
  • Host sends next command only after acknowledgment of the previous one

Then if the buffer is full, the host will not try to send more than one command. A user could then send M108 or M112 and it would get into the serial buffer ahead of the next command sent as part of the host-based print job.

I don't believe that hosts currently do any waiting for an "Acknowledging" reply before sending the next command.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 26, 2016

@thinkyhead that's the holly grail, but host people don't give a shit about that.. you saw what happen with the "busy" feature. Nevertheless I do believe that they wait for an "OK" before sending the next command.

@thinkyhead
Copy link
Member

@jbrazio Anytime I make changes related to "hosts and protocols" I make sure to involve the host people. Compared to some earlier changes I've made (naïvely), I thought the "busy" feature was received pretty well, and got a lot of good feedback. Which reminds me, we need to document our new codes (like M113) on the reprap wiki.

@thinkyhead
Copy link
Member

Instead of this feature, I suggest making an alteration to the "kill button" feature. When pressed, if heating is active just cancel heating. If heating isn't active (as on the next press) then do kill as usual.

@thinkyhead thinkyhead closed this Apr 26, 2016
@scalez
Copy link
Contributor Author

scalez commented Apr 26, 2016

@thinkyhead kill() is not the functionality we're seeking here though; this G-Code simply adds the ability to stop a wait for target temp to be reached (and doesn't touch the target temp). Please reconsider adding this G-Code.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 26, 2016

For me this type of "immediate" commands require a reformulation of the parser thus it will be something we will need to have in mind when rewriting the new parser.

@kakaroto
Copy link

TL;DR:

  • A bug in the host should not be a reason to reject a patch that is independent of any of that
  • Having a feature available only through LCD and not through a g-code command is not consistent
  • The host communications and ADVANCED_OK discussion, while important, is entirely off topic here
  • Using M112 to replace this is flawed because it is prone to multiple race conditions and will cause headaches.

@thinkyhead This pull request is not effecting how the host and Marlin communicate, it's about adding a new g-code to allow cancelling a heat&wait command (M109/M190). Any issues with the rx_buffer, commands_in_queue buffer, or their parsers is outside the scope of this PR and should be addressed in a new/different PR.
The discussion turned into host/buffer issues when @Blue-Marlin said the new g-code command is useless because of host issues and we covered that those issues are independent of merging this command and that not all hosts are misbehaving, also I said that Marlin should not limit itself by not implementing features because some hosts may not be able to use it properly.
The ADVANCED_OK discussion is off topic here and I've explained in my previous comment how Cura LE already resolved all those hosts/queue issues without modifications to Marlin.

Instead of this feature, I suggest making an alteration to the "kill button" feature. When pressed, if heating is active just cancel heating. If heating isn't active (as on the next press) then do kill as usual.

I disagree because not only will it become a mess with docs (a single command doing 2 different things dependent on the current state) and confusing to users (does it cancel the heatup or just cancel the blocking wait of the command), but it will also cause huge headaches as it is prone to race conditions :

  • What if it's sent after a M109 but before M109 is being processed by Marlin, due to async nature, M112 might get executed before M109 and considered a Kill instead of CancelWait.
  • A user (or software) might see it's still blocked on heating, sends a 'stop waiting' command M112 (maybe because the user is tired of waiting for the bed to reach target temp and just wants to start the print) and before the command reaches Marlin, heating is done, so M112 is suddenly interpreted as a kill.

Please reconsider or at least reopen the issue until it is discussed properly instead of dismissing it because of an entirely unrelated issue (host communication issues).

And in response to @jbrazio

@thinkyhead that's the holly grail, but host people don't give a shit about that.. you saw what happen with the "busy" feature. Nevertheless I do believe that they wait for an "OK" before sending the next command.

I do care about that, and I've spent a considerable amount of time resolving those specific issues from the host side. As for what you do believe the host does, that's right in the case of pronterface, but in the case of Cura, it fills the queue with 3 commands THEN it waits for 'ok' before sending the next, but the command timeout issue is what causes the problems with pronterface and what used to cause problems with Cura (see previous comment). I don't know about the behavior of the other host softwares.

@thinkyhead
Copy link
Member

thinkyhead commented Apr 26, 2016

The host communications and ADVANCED_OK discussion, while important, is entirely off topic here
This pull request is not effecting how the host and Marlin communicate

@kakaroto The code in this PR won't work. Because Marlin (MARLIN!) isn't listening to serial input during M109. Try it and you will see. Hosts have nothing to do with it. They work with Marlin the way they're supposed to work with Marlin.

As @Blue-Marlin correctly pointed out, M112 is similarly broken.

That is why I suggested altering the hardware kill button, because it is one way that you can talk to Marlin while the queue is full. Also, a "Cancel Heating" menu item can bypass the GCode queue. So these are the only options available for a user to cancel waiting for heating.

The machine should of course still wait for heating if the temperature is below minimum extrusion temperature, otherwise it will start to "print" but it will not move the E axis, and it will spit out errors the whole time.

@scalez
Copy link
Contributor Author

scalez commented Apr 27, 2016

Because Marlin (MARLIN!) isn't listening to serial input during M109. Try it and you will see.

@thinkyhead this is actually not the case, referring to my response to @Blue-Marlin:

L4502 in the M109 while loop and L4575 in the M190 while loop calls idle() which calls manage_inactivity() which calls get_available_commands() which will in turn call get_serial_commands() and allows a command to enter the commands_in_queue buffer.

We have tested this many times and it does work. We have a good understanding of how the commands_in_queue buffer interacts with the rx_buffer ring buffer. If there is an available element inside commands_in_queue then the host can fill that empty element with an asynchronous G-Code command, which can be executed right away in the get_available_commands() parser. The host needs to reserve at least one empty element inside Marlin's commands_in_queue buffer for asynchronous execution to work properly; get_available_commands() can always be reached so long as there is an available element inside commands_in_queue, regardless of any wait state(including M600, M109, M190, G29, G28, etc). In principle, there should never be commands waiting inside rx_buffer whilst commands_in_queue is full/congested. Therefore, it is up to the host software to maintain a certain number of lines of G-Code held by Marlin, by this reasoning alone it is safe to assume that the host is also completely capable of ensuring that there is at least one element reserved for asyc behavior.

Please test this PR by sending your machine an M109 R# and/or M190 R# then send G28 then M108. I suggest something like:

M109 R160
G28       ; do this immediately after the M109
M108      ; maybe wait 5 seconds before trying it (don't let the buffer get congested with M105s like Pronterface likes to improperly do)

@thinkyhead
Copy link
Member

That GCode doesn't fill the command buffer. Add a few more commands to the end and test it again.

@scalez
Copy link
Contributor Author

scalez commented Apr 27, 2016

@thinkyhead certainly if the command buffer is full then you cannot perform a command asynchronously, which is exactly why it is the job of the host software to ensure that there is available volatile memory (buffer array element) on the MCU to execute an asynchronous command.
If you sent the following G-Code to Marlin using Cura LulzBot Edition then it would still work:

M109 R160
G28
M105
M105
M105
M105
M105
M105
M105
M105
M105
M108   ; as soon as you send this the G28 and M105s will be processed right away

This is because Cura LE is properly maintaining Marlin'a command buffer, as it should.

The MCU will otherwise just accept any command it is given over USART. Choosing to discard certain commands on the firmware side is not the correct solution. The host software needs to be aware of which commands would be executed asynchronously and only fill the reserved element(s) of the command buffer with any one of those given async G-Codes and not the synchronous ones. This is the proper way to handle this issue, Marlin cannot simply do it alone, that is a huge reason why we return an 'ok' after processing a command to begin with; the host has a responsibility for properly managing the USART line and to not congest the firmware's buffer. This is also why host softwares maintain a buffer of their own as well, so that G-Codes are processed in their anticipated manner.

Marlin needs to be able to execute certain G-Codes asynchronously and M108 is a perfect example of such a desired functionality.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 27, 2016

No.. the host's job is to send commands until the firmware is full, for instance the "look ahead" i.e. the feature to produce a smooth extrusion is dependent on a "full" buffer. The host couldn't care less for firmware's SRAM or "volatile memory" as you called it, and it shouldn't as long as the "OK" commands keep being sent by the firmware. I feel we are iterating over the same point here.

Nevertheless did you even test your code ? Did it behave as you expect ? Which were the test conditions ?

@scalez
Copy link
Contributor Author

scalez commented Apr 27, 2016

@jbrazio currently 'ok' responses are only sent once a command is finished processing, not when it is received, with the exception of when G0/G1 enters the block buffer. Therefore, currently, the host doesn't know if the data was written to the SRAM until it is processed; and even if it did know it was received successfully how can the host force an asynchronous G-Code line to Marlin (and Marlin execute it async) if Marlin's command buffer is congested?

We've tested this several times, since Cura LE properly maintains Marlin's buffer, M108 is guaranteed to cause cancel_heatup = true during an M109/M190.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 27, 2016

Exactly, when the buffer is full, the next "ok" response from Marlin will mean a slot in the buffer just got free.

@jbrazio jbrazio added the S: Don't Merge Work in progress or under discussion. label Apr 27, 2016
@scalez
Copy link
Contributor Author

scalez commented Apr 27, 2016

@jbrazio correct, but if there are no free elements in commands_in_queue then nonsequential commands like M112 and M108 cannot be executed "out of queue" like expected. Cura LE Makes sure there's always a spot open so the expected behavior does happen.

@kakaroto
Copy link

@thinkyhead

@kakaroto The code in this PR won't work. Because Marlin (MARLIN!) isn't listening to serial input during M109. Try it and you will see. Hosts have nothing to do with it. They work with Marlin the way they're supposed to work with Marlin.

Marlin is listening to serial input during M109 and of course this was tried and tested for months in all use cases before the pull request was sent.

@thinkyhead

As @Blue-Marlin correctly pointed out, M112 is similarly broken.
That is why I suggested altering the hardware kill button, because it is one way that you can talk to Marlin while the queue is full. Also, a "Cancel Heating" menu item can bypass the GCode queue. So these are the only options available for a user to cancel waiting for heating.

M112 is not broken if the host acts properly.
And a hardware kill button is only useful when you are next to the machine, and with everyone now using octoprint or other means of remotely controlling their printers, I don't think that relying on a hardware kill button is a solution.

@thinkyhead

@kakaroto The code in this PR won't work. Because Marlin isn't listening to serial input during M109. Try it and you will see.

The code in this PR will work, as long as the host doesn't do something stupid. If for the same reasons you have, M112 wouldn't work, then why didn't you remove that command ?

@thinkyhead

That GCode doesn't fill the command buffer. Add a few more commands to the end and test it again.

Yes, the code @scalez put in his comment wouldn't fill the command buffer, and that's because he didn't know which host you are using, and because you said serial input is not read. But it proves that serial input is actually read and if you download and use the latest version of Cura LE (19.12) from https://www.lulzbot.com/cura and try whichever gcode you want which would actually fill the command buffer, then that would still work because Cura will do the right thing.

@jbrazio

No.. the host's job is to send commands until the firmware is full, for instance the "look ahead" i.e. the feature to produce a smooth extrusion is dependent on a "full" buffer. The host couldn't care less for firmware's SRAM or "volatile memory" as you called it, and it shouldn't as long as the "OK" commands keep being sent by the firmware. I feel we are iterating over the same point here.

The look-ahead feature will still work because the host will fill the command buffer of Marlin during moves, During a M109 or M190, there are no moves, and there is no look-ahead, which is why it will not affect print speed if the buffer is empty for a few milliseconds after heating is done. During moves, we do fill the command queue but we also need to keep at least one slot empty in the queue for async commands, so when M112 is sent, it will be sent and processed asynchronously, so it would work. Read my commit log here from Cura : https://code.alephobjects.com/rCURA397d17dadb66de92fcd01f7155e19224bc411c47
I also feel we are iterating over the same point here.

Nevertheless did you even test your code ? Did it behave as you expect ? Which were the test conditions ?

Yes, the code has been tested many, many times and has been tested for months now without issues in all possible use cases we could come up with. Test conditions are to use Cura LE 19.12+.

@jbrazio
Copy link
Contributor

jbrazio commented Apr 27, 2016

@daid @repetier @foosel anyone wants to weight in ?

@daid
Copy link
Contributor

daid commented Apr 27, 2016

IMHO, I think if you see Marlin as a command processor, this is adding unneeded complexity and the "wait for heatup" logic should be handled in the host.

@scalez
Copy link
Contributor Author

scalez commented Apr 27, 2016

@daid that may be the most ideal way to handle such situations, although any input given to the digital pins would also have to be able to wait in queue (including a GLCD), so the MCU needs to somehow be aware that it is in a wait state.

@jbrazio @thinkyhead @Blue-Marlin: here's a video of a Mini receiving an M109 S200, G28, then spammed many M105s (definitely enough to potentially clog the command buffer). Notice that the expected behavior still occurs when using Cura LE 19.12: https://goblinrefuge.com/mediagoblin/u/kuzmenko/m/marlin-m108-example/

@thinkyhead
Copy link
Member

thinkyhead commented May 30, 2016

"Noone" is not a word. Sometimes I wish it was. The weirder word is "weird" because it places "e" before "i" in contravention of the "usual rule."

@Jnesselr
Copy link
Contributor

If I recall correctly more words break that rule than abide by it.

@thinkyhead
Copy link
Member

If I recall correctly more words break that rule than abide by it.

Indeed. My last name, for example: Lahteine. But then, it is a Finnish name…

@petrzjunior
Copy link
Contributor

Yeah, it realized it should be "no one" but thought it is funny 😄 And god, I made a spelling mistake in comment of spelling... weird :)

@kakaroto
Copy link

So that makes the list: you, David, Gina, Repetier, and Kliment, pretty much… Let me know if I'm forgetting anyone. (Maybe we can make a named group and add all these members…?)

And me please, as I'm one of the main developers of Cura LE!

@jbrazio
Copy link
Contributor

jbrazio commented May 31, 2016

@daid, @foosel, @repetier, @kakaroto you'll receive and invitation to join the organization so I can made you part of a team. What is Kliment's github username ?

As for the developers I suggest @thinkyhead, @AnHardt and @CONSULitAS as our trusty advisor.

@kakaroto
Copy link

@jbrazio That would be @kliment
Thanks.

@jbrazio
Copy link
Contributor

jbrazio commented May 31, 2016

The team is created and people invited, is just a question of accepting the requests. To use the team the tag is @MarlinFirmware/host-software-team

@thinkyhead thinkyhead merged commit e485028 into MarlinFirmware:RCBugFix Jul 4, 2016
@thinkyhead
Copy link
Member

thinkyhead commented Jul 4, 2016

Working or no, here ya go y'all! You can always try M108 and see what happens.

@thinkyhead
Copy link
Member

Incidentally, we are totally open to choosing a different code number. The sooner the better, of course.

@scalez
Copy link
Contributor Author

scalez commented Jul 6, 2016

@thinkyhead I think M108 makes perfect sense due to its proximity to M109. The BFB FW is an old abandoned project intended for a specific, corner-case, proprietary machine. FWIW there are already several conflicts between existing Marlin G-codes and the BFB's set. No other firmware occupies M108.

@thinkyhead thinkyhead mentioned this pull request Jul 8, 2016
@alephobjects-sync-bot alephobjects-sync-bot deleted the M108_cancel_heatup branch July 15, 2016 15:53
@jbrazio jbrazio modified the milestone: 1.1.0 Jul 18, 2016
@a0s
Copy link

a0s commented Sep 17, 2016

Nice idea! Have G29 same "cancel" command? If it does not.. should we have separate cancel commands for each long-time action inside Marlin or one universal command with parameter (describe what we want to stop) ?

@thinkyhead
Copy link
Member

thinkyhead commented Sep 18, 2016

I just added support for M108 to the M0/M1 command (only if no controller is enabled), and it was pretty simple to do. It's a little more involved to do the same for G29 but not terribly difficult. It would only be able to interrupt in-between grid points, not right in the middle of a probe. The extra work comes in deciding whether to disable and stow the probe or leave it deployed when interrupting the process. And it requires EMERGENCY_PARSER to be enabled, which adds some overhead to the serial input processing. It's not like this would be enabled by default. In the long run, G29 should probably be cancel-able from the LCD also.

@scalez
Copy link
Contributor Author

scalez commented Sep 19, 2016

@thinkyhead @AnHardt what about handling M18/M84 in emergency_parser(unsigned char c)? Maybe to minimize the overhead we could make macros for each priority command? That way the user can define which commands should be priority.

@thinkyhead
Copy link
Member

thinkyhead commented Sep 20, 2016

I vote no on M18/M84. For one thing, the stepper ISR or planner would immediately power the steppers right up again.

@scalez
Copy link
Contributor Author

scalez commented Sep 20, 2016

@thinkyhead ah you're right, I just realized we have M410; wouldn't calling M410 in the middle of G29 stop probing since it calls Stepper::quick_stop()?

@thinkyhead
Copy link
Member

thinkyhead commented Sep 21, 2016

G29, like all other GCode commands, is blocking until it completes. No other GCode commands coming from the USB input or via the SD card can be processed until the G29 command completes. Hence, one would need to make G29 interruptible via M108 by adding a new flag, or expanding/renaming the wait_for_heatup flag so that G29 may be interrupted in the same manner as the blocking heatup commands — only possible with the EMERGENCY_PARSER feature enabled.

@scalez
Copy link
Contributor Author

scalez commented Sep 21, 2016

@thinkyhead ah yes the need for a flag to escape from gcode_G29()'s nested for loop. Stepper::quick_stop() would stop any movement happening from do_blocking_move_to_* calls in probe_pt(const float, const float, bool, int) or run_z_probe() though.

I'm not sure if the same flag should be shared between G29 and M109/M190; I suppose that depends if there's an instance where someone would prefer to escape from one blocking command and not another.

@thinkyhead
Copy link
Member

thinkyhead commented Sep 21, 2016

Get to know the Marlin architecture. You can't have more than one command running at once. Marlin will either be inside of G29 or it will be inside of M109 or it will be inside of M190. It's a single-thread application with one interrupt service routine for stepping and one ISR for managing sensors. It's not a state machine. Commands that need to wait for some state simply call idle() or thermalManager.manage_heater() in a loop, without ever returning from the command handler function.

@scalez
Copy link
Contributor Author

scalez commented Sep 21, 2016

@thinkyhead I understand it couldn't inject in the middle of handing an ISR such as ISR(TIMER1_COMPA_vect). Although between servicing the timer interrupt, calls to Stepper::quick_stop() will flush the planner's block buffer. The same as is done in lcd_sdcard_stop() function. Therefore when this hypothetical GCode enters the rx_buffer (via the USART ISR) and is handled via the EMERGENCY_PARSER it would call Stepper::quick_stop() and would stop whatever movement is currently in the block buffer. Then the flag would trigger G29 to escape its routine altogether.

Try selecting the "Stop print" whilst in the middle of G29 (via SD/SPI).

@Blue-Marlin
Copy link
Contributor

IMHO quick_stop will not help a lot. The most time G29 is waiting for a move to finish. If the motion Q is emptied, G29 happily will continue with the next move - and a corrupted position. M108 and asking the flag in G29, at a couple of places, has a much better chance to succeed.
But is things go really wrong - take the big red button.

@scalez
Copy link
Contributor Author

scalez commented Sep 21, 2016

@Blue-Marlin you could use a flag to escape from probe_pt(const float, const float, bool, int), run_z_probe(), and the nested for loop inside gcode_g29(). I agree that the position would be corrupted, this would just be a possible method for stopping any current movement in the block buffer. I'm not necessarily promoting this idea, just brainstorming.

@thinkyhead
Copy link
Member

thinkyhead commented Sep 22, 2016

I understand it couldn't inject in the middle of handing an ISR…

I wasn't talking about the ISR, but the main thread and the command parser. When a command like G29 is being processed it completely blocks the GCode command stream. Any GCode-based concept can only add to the end of the command buffer, unless EMERGENCY_PARSER is there to look for special commands or special bytes and set a flag. The EMERGENCY_PARSER itself runs in an interrupt, so flag-setting is about the only safe thing it can do.

calls to Stepper::quick_stop() will flush the planner's block buffer.

By setting a flag that, when set, causes the ISR to flush itself.

EMERGENCY_PARSER … would call Stepper::quick_stop() and would stop…

Bad Idea!™ Setting a flag is safe. But calling a function from the RX interrupt —like Stepper::quick_stop— which is only meant to be handled in the main thread would probably cause the firmware to crash. What if it was called in the middle of Planner::buffer_line, for example?

Try selecting the "Stop print" whilst in the middle of G29 (via SD/SPI).

Yes, the LCD can take direct actions, but if an LCD menu item is only injecting a GCode command (as some of them do) they just go into the queue, and will not interrupt other GCode commands. And actually, the "Stop Print" command doesn't cause G29 to exit, so results may be unpredictable in that case. Also, try stopping G29 from the LCD when it wasn't started from an SD file. You won't find a command that does this. There could be one, but it would only set the same flag being proposed for M108. In the same manner, you could add a command to the LCD menus that stops waiting for heatup by calling a function that clears the wait_for_heatup flag.

the position would be corrupted

I implemented methods to recover the current position directly from the stepper counters after a forcible interruption. Anyway, I believe all the functions used by G29 now use current_position to track moves sent to the planner. So if G29 was interrupted (by a flag) the position would only be lost (temporarily) if moves were forcibly halted in the middle. If steps are lost because a move brakes too suddenly, we can't do anything about that. It would be interesting to try to implement a feature to cause a proper deceleration and end moves more elegantly to prevent lost steps. But doubtless, this special option would add more overhead to the stepper ISR, all the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.