Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] 120Hz pulse at the beginning of each move/Classic Jerk not functional (?) #20150

Closed
holgin opened this issue Nov 15, 2020 · 45 comments
Closed

Comments

@holgin
Copy link

holgin commented Nov 15, 2020

Bug Description

I've been trying to write a small piece of software, which in part reads the STEP/DIR signals generated by Marlin.
To properly test that, I bought myself a logic analyzer and I have found something weird.

No matter that are the acceleration, jerk or speed settings, if I have S_CURVE_ACCELERATION or ADAPTIVE_STEP_SMOOTHING enabled or whatever - one, single pulse is always generated 8.33ms (=120Hz) before the rest, "normal" pulses are generated.

Below you can find some pictures:
NASS_SCA_1000mms2_10mms_60mms_120Hz_1,157kHz_2
Series of 10mm moves

NASS_SCA_1000mms2_0mms_60mms_120Hz_1,157kHz_1
Closeup of one of those moves

I have tested Jerk set to 0, 4 and 10mm/s with accelerations 300 and 1000 mm/s2.

I'm running Marlin Bugfix from 08.11.2020, but the same thing was happening on 2.0.3.

I did not try it with Junction Deviation as it is not working very well for me (extruder).

My board is a custom one, based on the LPC1768 uC and stepper motor drivers are TMC2208.

On the other hand, I have noticed that the frequency of subsequent pulses does not change with different Jerk values, which is weird, because I would expect them to immediately jump to frequency which results from Steps/mm and Jerk values.

I've talked about this issue with @sjasonsmith on discord and he said he has seen these weird pulses on a STM32 board.

Configuration Files

config_files.zip

Logs

Measurements from the logic analyzer are also attached. You can open them with Saleae Logic 1.2.18 software.
Naming of the files:
[Adaptive_step_smoothing] [S_Curve_Acceleration] [acceleration] [jerk] [speed]
logs.zip

I can do more tests/measurements if needed.

@holgin
Copy link
Author

holgin commented Dec 13, 2020

So... no hope of getting into it any time soon? I'm willing to do whatever tests and measurements are needed for proper debugging.

@qwewer0
Copy link
Contributor

qwewer0 commented Feb 21, 2021

While I don't fully understand it, but I still think this should kept open.

@qwewer0
Copy link
Contributor

qwewer0 commented Apr 23, 2021

Keeping this open.

@qwewer0
Copy link
Contributor

qwewer0 commented Jun 24, 2021

Lets keep this open.

@holgin
Copy link
Author

holgin commented Aug 25, 2021

From my point of view due to lack of interest this thread can be closed, I switched to Klipper firmware some time ago and I will not be able to test/reproduce anything.

@holgin holgin closed this as completed Aug 25, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Oct 24, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@thinkyhead thinkyhead reopened this Oct 25, 2021
@github-actions github-actions bot unlocked this conversation Oct 25, 2021
@thinkyhead thinkyhead added Needs: More Data We need more data in order to proceed Bug: Potential ? and removed stale-closing-soon labels Oct 25, 2021
@MarlinFirmware MarlinFirmware deleted a comment from github-actions bot Oct 25, 2021
@sjasonsmith sjasonsmith reopened this Nov 26, 2021
@sjasonsmith sjasonsmith self-assigned this Nov 26, 2021
@descipher
Copy link
Contributor

Rechecked with a new Hantek LA.

X step pin - ATMEGA2560

image

X step pin - STM32F103VET

image

So this is confirmed on real hardware.

@sjasonsmith
Copy link
Contributor

@descipher if you have ideas to investigate further don’t let me get on your way! I have a bit more time this weekend then will be back to work. I am wondering if this is possibly even correct behavior… it looks odd but maybe that is how jerk/acceleration constraints are met?

@descipher
Copy link
Contributor

The fact that it is present on various hardware/emulation targets means we should be looking in the stepper common code. The probe outputs are from default configs in branch bugfix-2.0.x.
I am working on the STM32 maple pwm issue first, no reason we can't all look for what's causing the premature pulse now that we know it's real.

@sjasonsmith
Copy link
Contributor

I've done some more testing with the simulator.

I've found that the "stray" pulse is always the same time before the rest of the pulse train. It is not influenced by speed, acceleration, or jerk. I see it whether or not I have classic jerk enabled.

Here you see the stray pulse followed by the normal acceleration:
image

The one situation where I do NOT see this, is when movement is slow enough that no acceleration curve is used:
image

If even a small amount of acceleration exists between the first two pulses, the first two are always the same time apart, equating to 1.5 mm/s.
image

@sjasonsmith
Copy link
Contributor

sjasonsmith commented Nov 28, 2021

Well, I found where it is coming from. Understanding whether it is proper behavior or not is a harder question.

void Planner::calculate_trapezoid_for_block(block_t * const block, const_float_t entry_factor, const_float_t exit_factor) {

  uint32_t initial_rate = CEIL(block->nominal_rate * entry_factor),
           final_rate = CEIL(block->nominal_rate * exit_factor); // (steps per second)

  // Limit minimal step rate (Otherwise the timer will overflow.)
  NOLESS(initial_rate, uint32_t(MINIMAL_STEP_RATE));
  NOLESS(final_rate, uint32_t(MINIMAL_STEP_RATE));

In the stepper interrupt code, the timing between the first two steps is controlled by initial_rate, which in this case is set to MINIMAL_STEP_RATE=120. It then does the expected normal acceleration on all the following steps.

I'm not familiar with the planning code or what influences the entry_factor. In my debugging the entry_factor was really small (something like 0.0075), which caused the calculated initial_rate (5) to be less than the minimum (120).

@sjasonsmith
Copy link
Contributor

I need to figure out whether these calculated acceleration/plateau numbers correctly correspond with this pulse train, or if there is an "off-by-one" error and we should be using our acceleration parameters between the first two pulses:

image

image

Right now I'm thinking that initial_rate shouldn't actually be the rate used between the first and second pulses. It is the "incoming" rate from the previous block, and acceleration should begin immediately from that rate.

@descipher
Copy link
Contributor

descipher commented Nov 28, 2021

I did correct a step calculation issue in my laser PR that still not merged. It related to this in some ways.
see this in the current code:

block->decelerate_after = accelerate_steps + plateau_steps;

I had to change the deceleration calc or the planned step counts would be be 1 off.
block->decelerate_after = decelerate_steps + plateau_steps;

Here the decel point is not block->decelerate_after = acceleration_steps + plateau_steps it should be block->decelerate_after = decelerate_steps + plateau_steps;.

e.g. with actuals of accel = 10, plateau = 20, decel = 9 we would get a total step count of 39 but when you do the calc based on the existing code its a total step count of 40

@sjasonsmith
Copy link
Contributor

@descipher I just looked at your changes and left a comment on that PR. The new change doesn't seem right to me, but probably work in some cases. We could very well have several "off by one" errors in this code...

@sjasonsmith
Copy link
Contributor

OK, the current behavior is definitely incorrect. The step generation code assumes that we are actually accelerating during that entire 8.3ms between the first two steps, which causes a huge acceleration spike on the third step, followed by an early arrival at our coasting speed.

With the current code, we don't accelerate or decelerate at all between the first two steps, we only maintain the entry speed.

I've modified the code to actually calculate the step timing according to the desired acceleration. Using that I was able to generate plots comparing the current code to my "hopefully more mathematically correct" modifications. I think the charts speak for themselves.

NOTE 1: This is currently only a prototype. I am probably doing too much math in the stepper ISR, and more work will be needed to solve this properly.

NOTE 2: I'm comparing trapezoid acceleration here, but the same problem exists for S-Curve.

NOTE 3: The timing of velocity reaching zero is an estimate. The Pulses stop at the last non-zero velocity, and I've estimated an extra timestamp to allow me to estimate the acceleration as we come to a stop. This is just a guess, and it looks like we still might be stopping a little too abruptly. We may need to start decelerating one step sooner than even my new changes.

image

@sjasonsmith
Copy link
Contributor

The charts above are from a G0 X0.4 F2000 command, using otherwise default settings in the simulator example.

@holgin
Copy link
Author

holgin commented Nov 28, 2021

I just wanted to say that I'm happy that this is being worked on and it's good to hear that my initial observations were correct!

@descipher
Copy link
Contributor

descipher commented Nov 29, 2021

Definitely concur with your findings, the issue I see is the first step has already occurred in pulse_phase_isr
however block_phase_isr has not calculated the initial_rate interval value on entry so the interval is incorrect for the next step.
We need to calculate the interval for step 1 of the block

e.g.

      if (step_events_completed == 1)  // Calculate initial_rate interval timer value
        acceleration_time = calc_timer_interval(current_block->initial_rate, &steps_per_isr);

image

I still see more issues in the output. The deceleration in this capture is not right, checking further.

@descipher
Copy link
Contributor

descipher commented Dec 2, 2021

I have improved the accel to decel scaling. More testing required and there are some additional quirks to address. This is only one target HAL, others need to be examined to validate the various CPU timer configurations behave consistently.
image

G0F4000X1 = 80 Steps = 1mm
accel = 40 Steps, decel = 39 Steps

@sjasonsmith
Copy link
Contributor

Just a quick update for anyone watching this. @descipher and I have been working together on this and sharing ideas. We have a few ongoing experiments which show promise. There is a lot of intricacy in these bits of code so progress is slow.

So far we have only explored fixing trapezoidal acceleration. Additional work will be needed to apply similar changes to S-Curve.

@thinkyhead
Copy link
Member

That initial 120 now labeled as MINIMAL_STEP_RATE has been in Marlin for a very long time, so I'm not surprised if it became obsolete along the way and is now just a vestigial thing. It is so old that its original purpose is just about lost to obscurity.

@descipher
Copy link
Contributor

That initial 120 now labeled as MINIMAL_STEP_RATE has been in Marlin for a very long time, so I'm not surprised if it became obsolete along the way and is now just a vestigial thing. It is so old that its original purpose is just about lost to obscurity.

Based on my tests if the step rate value is too small issues will surface, such as the timer continuously overflows or the accel formulas fail with 0.

@thinkyhead
Copy link
Member

Is this issue still relevant to the latest code?

@descipher
Copy link
Contributor

Unfortunately I think it is, will verify this weekend and update.

@sjasonsmith
Copy link
Contributor

This might have been fixed with some linear advance rework last year? Was that @tombrazier?

@tombrazier
Copy link
Contributor

Yes, I think I did fix it without realising this issue had already been raised. I certainly remember seeing it when using the simulator. Looking back, I think it was probably commit aaf7d0e in PR #24533. I'd have to look at the code again to understand why that fixed it, though.

@descipher
Copy link
Contributor

image

This is fixed, confirmed with LA.

@sjasonsmith
Copy link
Contributor

Thanks for double-checking @descipher. I’ll close it out!

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants