Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphics Corrupted on 12864 LCD with MEGA2560/RAMPS/12864LCD #21495

Closed
descipher opened this issue Mar 31, 2021 · 32 comments
Closed

Graphics Corrupted on 12864 LCD with MEGA2560/RAMPS/12864LCD #21495

descipher opened this issue Mar 31, 2021 · 32 comments

Comments

@descipher
Copy link
Contributor

Did you test the latest bugfix-2.0.x code?

Yes, and the problem still exists.

Bug Description

Random bits draw on status and Menu's

Bug Timeline

On or slightly before Mar 29/2021

Expected behavior

No random bits flying around.

Actual behavior

Random bits flying around.

Steps to Reproduce

Clone buxfix-2.0.x
Using RAMPS MEGA2560 based controller:
edit Configuration.h to include #define REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER

Version of Marlin Firmware

bugfix-2.0.x

Printer model

N/A

Electronics

No response

Add-ons

No response

Your Slicer

No response

Host Software

No response

@descipher
Copy link
Contributor Author

IMG_0672

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Mar 31, 2021

Hi. Could you please provide your config files? Otherwise, diagnostics will be very difficult.

@descipher
Copy link
Contributor Author

Configuration.zip
Default with REPRAP_FULL_GRAPHICS

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Mar 31, 2021

I guess this could be a delay issue similar to #21264 and fixed for different boards in PRs #21346 and #21425. I would suggest to try setting

#define BOARD_ST7920_DELAY_1 DELAY_NS(125)
#define BOARD_ST7920_DELAY_2 DELAY_NS(63)
#define BOARD_ST7920_DELAY_3 DELAY_NS(125)

in Configuration.h and see if that fixes it.

@descipher
Copy link
Contributor Author

It does correct the issue.

@FanDjango
Copy link
Contributor

I guess we can now see where this is going. Time to visit all configs, maybe?

@descipher
Copy link
Contributor Author

In Configuration_adv.h, this could do them all ..
#if ENABLED(U8GLIB_ST7920)
// Enable this option and reduce the value to optimize screen updates.
// The normal delay is 10µs. Use the lowest value that still gives a reliable display.
//#define DOGM_SPI_DELAY_US 5
//#define LIGHTWEIGHT_UI
#if ENABLED(LIGHTWEIGHT_UI)
#define STATUS_EXPIRE_SECONDS 20
#endif
#endif
#if ARDUINO_AVR_MEGA2560
#define BOARD_ST7920_DELAY_1 DELAY_NS(125)
#define BOARD_ST7920_DELAY_2 DELAY_NS(63)
#define BOARD_ST7920_DELAY_3 DELAY_NS(125)
#endif
#endif

@descipher
Copy link
Contributor Author

Nope there are many that it would miss based on conditionals ...

@ellensp
Copy link
Contributor

ellensp commented Apr 1, 2021

This is only required on really bad slow lcd's
Not needed by everyone.

@descipher
Copy link
Contributor Author

This is only required on really bad slow lcd's
Not needed by everyone.

The ST7920 LCD driver is not bad it's just old, but there are 10's of thousands of them out there in active use.

@ellensp
Copy link
Contributor

ellensp commented Apr 1, 2021

its not the driver. its the actual lcd module. My REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER on ramps still works as it always has, without needing delays when on a mega2560. They say they are the same chipset.
I suspect some of these clones are just not up to spec. More junk from the reject bin sold to unsuspecting buyers.

@descipher
Copy link
Contributor Author

The scope is the only real way to know if the signal time is not correct or marginally close. I can set it up a full test, but in my experiences it's not usually the driving chip .. it's the driving signal and noise. Nothing changed on the hardware I have, then suddenly the display goes south after some AVR code changing along with lots of others hitting the same behavior. I think it's more likely to be tight code timing issues than a bad display LCD chip. I have only once in my career observed poor tolerances in a batch of LCD chips and it was intermittent in nature. Based of a quick check to the clock specs at https://www.crystalfontz.com/controllers/Sitronix/ST7920/ I am betting on marginal timing coming out of the code.

@thisiskeithb
Copy link
Member

I can't remember which PR or when the change happened, the delays were sped up by something like 50%.

This would mean all the current LCD timings are incorrect (some worse than others) since those weren't updated & likely why there are reports & PRs trickling in to update LCD timings.

@ellensp
Copy link
Contributor

ellensp commented Apr 1, 2021

wasn't that was for stm32, not avr

@thisiskeithb
Copy link
Member

wasn't that was for stm32, not avr

It didn't apply as broadly as I had hoped: #20901

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Apr 1, 2021

As detailed in this post, it looks like a switch from GCC 5.4 to 7.3 in the platformio toolchain for AVR also changed the machine code, resulting in some delays being one cycle shorter than before. Also, based on what I gathered from that thread, the original delays on some AVR boards were tight even before that change in the toolchain.

I would be open to make a PR for the RAMS 1.4 board affected here, but I am not sure where the default delays for this are set. Does this case apply for the board

#elif F_CPU == 16000000
#define CPU_ST7920_DELAY_1 DELAY_NS(0)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(63)

?

@descipher
Copy link
Contributor Author

Did some research and its clear that the timing is not correct for the mega2560. If the CPU is at the common 16MHZ one instruction cycle would be 62.5ns and with SPI that means we are very marginal when pushing a data byte. The spec for the ST7290 is a minimum 400ns SCLK cycle, we are frequently below that minimum. The fact that worked in the past is based on the chip working faster than spec in most cases!

image

#elif MB(RAMBO)
#define CPU_ST7920_DELAY_1 DELAY_NS(0) <- set CS LOW and Delay 0 = 62.5ns (I/O op)
#define CPU_ST7920_DELAY_2 DELAY_NS(0) <- immediately send data, ok this can work the spec is 40ns we used 62.5ns
#define CPU_ST7920_DELAY_3 DELAY_NS(0) <- 62.5ns later set CS high for a total of 190.5ns

test condition and loop = 62.5ns
total cycle is 251ns!

This is code for the u8glib_rrd_AVR and will likely fail with many ST7920 chips.

So the theory is we can add a delay of 150ns on any of 1,2 or 3 and it should function correctly at that point.

So that's what I tested:

Existing code:

#elif F_CPU == 16000000
#define CPU_ST7920_DELAY_1 DELAY_NS(0)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(63) failed

test 1
#elif F_CPU == 16000000
#define CPU_ST7920_DELAY_1 DELAY_NS(150) passed
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(0)

test 2
#elif F_CPU == 16000000
#define CPU_ST7920_DELAY_1 DELAY_NS(0)
#define CPU_ST7920_DELAY_2 DELAY_NS(150) passed
#define CPU_ST7920_DELAY_3 DELAY_NS(0)

test 3
#elif F_CPU == 16000000
#define CPU_ST7920_DELAY_1 DELAY_NS(0)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(150) passed

The issue is that we are not within the documented spec of 400ns now that the compiler has improved optimizations.

@FanDjango
Copy link
Contributor

FanDjango commented Apr 1, 2021

If you look at all the conditionals and different values, one thing becomes clear: a DELAY_NS(100) for example is not the same amount of nanoseconds (i.e. 100 one would hope) on each board. In that case the "_NS" moniker is misleading.

If on the other hand, the delays actually ARE nanoseconds (reasonably accurate or minumums), they could be the same on each board.

I know this came about historically, but there is a chance to get this cleaned up.

@descipher
Copy link
Contributor Author

I believe thats correct. The resolution is 62.5ns in this case if the value is 63 you will get a delay of 125ns , in the tests I performed we are getting 186.5ns.

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Apr 1, 2021

150 ns would effectively lead to 187.5 ns delays + communication command times. If we want to stay in spec, we would rather use

#define CPU_ST7920_DELAY_1 DELAY_NS(125)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(125) 

right? That should add up to 437.5 ns per full cycle (assuming each command between the delays takes 62.5 ns). Anything problematic with this?

@descipher
Copy link
Contributor Author

descipher commented Apr 1, 2021

Based on the ST7920 specs and how the compiler works with the DELAY_NS() macro there should be no need to condition on what board or F_CPU is used. The code should just define the delays based on the target device spec. One delay definition for all of them.

e.g.
#if ENABLED(U8GLIB_ST7920)
#define ST7920_DELAY_1 DELAY_NS(60)
#define ST7920_DELAY_2 DELAY_NS(40)
#define ST7920_DELAY_3 DELAY_NS(300)

...

@descipher
Copy link
Contributor Author

descipher commented Apr 1, 2021

I have an ST32 (MKS) board handy, It should work with the ST7920, I will give it a poke later and test the theory.

@descipher
Copy link
Contributor Author

150 ns would effectively lead to 187.5 ns delays + communication command times. If we want to stay in spec, we would rather use

#define CPU_ST7920_DELAY_1 DELAY_NS(125)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(125) 

right? That should add up to 437.5 ns per full cycle (assuming each command between the delays takes 62.5 ns). Anything problematic with this?

I don't see any issues with it.
On my other idea of one config for all, It may not be safe and with all the fast emulators it would be a constraint to some configs. I will test the 125's, should be fine of course but I will test it anyway.

@descipher
Copy link
Contributor Author

Passes OK on MEGA2560/RAMPS1.4/12864LCD

@ramiropolla
Copy link
Contributor

In any case I suggest writing ST7920_SWSPI_SND_8BIT() in assembly so we have greater control of the timing, at least for AVRs. This code might change with another compiler update or different flags, and the timing will be off again. If we write it in asm, we're sure that it won't change anymore.

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Apr 2, 2021

@ramiropolla Good point. It would also be a somewhat redundant re-implementation of the delay functions, but I get the reasoning.

If assembler is requested, I can't supply the PR because I don't know the language.

@descipher
Copy link
Contributor Author

@ramiropolla Good point. It would also be a somewhat redundant re-implementation of the delay functions, but I get the reasoning.

If assembler is requested, I can't supply the PR because I don't know the language.

I think it would not be in the best interests of the community to do one driver in assembler to have more control over the small potentially negative impact of compiler improvements. Simplicity would keep the code maintained and serviceable. An error in the previous code is the root cause and not the compiler improvements fault.

@ramiropolla
Copy link
Contributor

An error in the previous code is the root cause and not the compiler improvements fault.

How did you get to that conclusion? What is the error in the previous code?

(I genuinely want to know. did you read this comment?)

@descipher
Copy link
Contributor Author

I checked the timing on my scope is how I know the code is in error. The original code does not follow the specifications for the minimum cycle time of SCLK. Even without the compiler improvements the original code generated an SCLK time shorter than the spec.

@thisiskeithb
Copy link
Member

Can this be closed now that a couple related PRs have been merged?

@descipher
Copy link
Contributor Author

The final required delay should be

#define CPU_ST7920_DELAY_1 DELAY_NS(125)
#define CPU_ST7920_DELAY_2 DELAY_NS(0)
#define CPU_ST7920_DELAY_3 DELAY_NS(188)

This is now resolved.

@github-actions
Copy link

github-actions bot commented Jun 9, 2021

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants