Improve I2C Flush (Update) Speed #94
Very nice! A few observations from my side:
Re: Faster screen update times might not be a good thing
I don't think any sane person would ever rely on the timing of an I2C bus for their application; there are so many things which can go sideways here that it's completely unusable.
Thanks for the feedback!
Great catch. It looks like most of the divisions are by powers of 2, so I am working on replacing those with bit shift operations.
I definitely agree. My first thought was to use Rust's powerful iterators, but I haven't had much success with this approach. My other thought was to modify
Divisions of unsigned integers by powers of 2 are unproblematic. The same for signed integers can sometimes be problematic. It's more the odd divisors I'm worried about... I also tried iterators a while ago without much success. It's probably best to add the windowing directly to the low-level send functions.
I have moved the windowing into
This approach has removed most of the divisions, except for 2 initial divide-by-8 calculations. As far as performance boost goes, updating a quarter of the display is now 68.5% faster than the original flush (increased from 61.9% previously), updating half of the display is now 37.5% faster (increased from 28.5% previously), and updating the full display is actually 0.5% faster (as opposed to a 10% hit previously).
Great job, sounds fantastic. @jamwaffles do you have time to give it a try? I'm a bit tied up at the moment.
Thanks for this PR! The perf numbers are looking good. I'll test and review it more thoroughly when I have some time to get my Blue Pill out.
To satisfy my own curiosity, Rust is smart enough to convert all these to
And if they do, it's not our problem 😂. Regular screen updates should be handled by interrupts or something IMO, so I2C timing isn't much to worry about from the driver's side.
If you divide by a non-power of two on thumbv6m you'll get this:
on thumbv7m it's a good deal better:
As I said: divisions by a power of 2 should be okay for unsigned variables on any platform; anything else depends...
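To make the point about divisors concrete, here is a minimal sketch of the divide-by-8 arithmetic an SSD1306 driver needs, since the display memory is organised in pages of 8 pixel rows. The function names are hypothetical and are not taken from this PR; the sketch only illustrates why the unsigned case is safe to express as a shift while the signed case is not equivalent.

```rust
/// Page index for pixel row `y`; for unsigned `y` this equals `y / 8`.
/// (Hypothetical helper, not part of the driver's API.)
pub fn page_of(y: u32) -> u32 {
    y >> 3
}

/// Bit position of row `y` inside its page; for unsigned `y` this equals `y % 8`.
/// (Hypothetical helper, not part of the driver's API.)
pub fn bit_in_page(y: u32) -> u32 {
    y & 7
}

/// Returns `(v / 8, v >> 3)` for a signed value.
/// Signed division truncates toward zero, while `>>` on a signed integer is an
/// arithmetic shift that rounds toward negative infinity, so the two results
/// differ for negative inputs that are not multiples of 8.
pub fn signed_div_and_shift(v: i32) -> (i32, i32) {
    (v / 8, v >> 3)
}
```

For unsigned values the compiler can normally perform this rewrite itself, so the shift form is mostly a matter of making the intent explicit.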
I haven't run it on a physical display yet so this first review is just looking at the code. A few style nits but otherwise ok.
Can I ask you to run cargo fmt as well? I've opened #95 to check formatting in CI which will break this PR's build.
I'll test on an SPI device this evening. It would be good to:
Thanks a bunch for the feedback. I have made the style changes and run I have been testing in
It also just occurred to me that we should make sure
The examples use the following pattern to initialise and clear the display:
    disp.reset(&mut rst, &mut delay).unwrap();
    disp.init().unwrap();
    disp.fast_flush().unwrap();
Because no pixels have been set in the buffer at this point,
Also, here's another example of the odd SPI behaviour. I ported the
I found an error in my SPI implementation where I had a
I've put together a gist with the logic I've been using for benchmarking. For some reason, I haven't been able to get my machine to compile for the Blue Pill, so there's no guarantee that the gist will compile, but the program logic should be sound. If you'd like, I can share the code I'm using to test on my STM32F4 Discovery board.
I hadn't thought about support for rotated displays, but I've added support in my latest commit. Please let me know if there is a more Rusty way of writing that logic.
As far as
Yep, that's fixed it! The artifacts are from the power-on state of this display - it seems to just leave some pixels randomly turned on. Using a whole-screen flush as in
Yeah, I think this is a good idea - the artifacts mentioned above would be cleared as before then. I also noticed that the display isn't cleared on restarting the chip, so clear on init is a definite requirement IMO. For example, this is me testing rotations 3 times:
Thanks for putting that up in a gist. It looks like you're checking timings with an oscilloscope or logic analyser hooked up to a pin? Unfortunately I don't have either of those bits of kit so I'll trust your I2C numbers 😁
I'll do another review pass, but rotation results aren't quite right for SPI. I2C works fine strangely enough. This is the
I've added a call to
No problem! Unfortunately I don't have an oscilloscope or logic analyzer either, so I'm relying on the "super precise" method of hooking up an ESP32 to the GPIO pin and measuring the time between edges. I probably should have mentioned this in my initial comment.
That is quite strange. I'll try to get my hands on a SPI display so that I can do local testing and figure out what's going on.
Out of curiosity, what
Mint. I'll do another test run later :) I've just been running it with
    DisplayRotation::Rotate90 | DisplayRotation::Rotate270 => {
        ((self.max_x | 7).min(width), (self.max_y + 1).min(height))
    }
};

self.min_x = width - 1;
self.max_x = 0;
self.min_y = width - 1;
Should this be height?
I'm pretty sure it is correct as it is since the width and height take into account the display's orientation. Just for fun, I tried swapping width and height in line 169 and got a really cool (but unfortunately incorrect) vertically mirrored display.
Pahaha please ignore my dumb ass, the rotation is working fine - the examples set an offset to center the Rust logo in the center of the screen horizontally. This of course changes when rotated by 90/270 degrees so instead of translating by
I think the examples and doc examples can have the initial call to
Awesome, glad you got it sorted! Just to be clear, should I create a new commit with the initial calls to
I haven't verified on hardware myself but intention and code look good to me.
Yes please
Ah yes please. I forgot about that
Done and done!
Sweet! I'll do a final review/physical device test as soon as I can.
Did a final test pass on both I2C and SPI devices which both seem to work fine. Code looks good too so I'll get this merged. Thanks for your time and hard work!
Released in 0.3.0-alpha.2. Thanks!
This increases the performance of the I2C interface and possibly the SPI interface as well, but I don't have a SPI-based SSD1306 for verification. The main idea comes from the fact that the SSD1306 has support for writing to a subset of the display memory buffer as if it were contiguous. This means that if you only have a small window within the display to update, you don't have to update the entire display.
This is achieved by keeping track of the locations of the pixels that have changed since the last call to flush or reset. The only values we need to track are the max and min values for the x and y coordinates of each pixel. Once a new call to flush happens (fast_flush in this pull request), the elements of the display buffer on the micro-controller that are within the minimum bounding rectangle of changed pixels are copied to a secondary buffer, the SSD1306 is commanded to work with the window subset of the minimum bounding rectangle, and the secondary buffer is sent to the SSD1306.
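The tracking and copy steps described above can be sketched roughly as follows. This is an illustration only, not the code in this pull request: `DirtyTracker`, `Window` and `copy_window` are hypothetical names, and the framebuffer is assumed to be laid out page-major (one byte per column per 8-row page, index `page * width + x`).

```rust
/// Window of display memory to update, in columns and 8-row pages (inclusive).
/// Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Window {
    pub x0: u8,
    pub x1: u8,
    pub page0: u8,
    pub page1: u8,
}

/// Tracks the minimum bounding rectangle of pixels changed since the last flush.
pub struct DirtyTracker {
    width: u8,
    height: u8,
    // (min_x, max_x, min_y, max_y), or None if nothing has changed.
    bounds: Option<(u8, u8, u8, u8)>,
}

impl DirtyTracker {
    pub fn new(width: u8, height: u8) -> Self {
        DirtyTracker { width, height, bounds: None }
    }

    /// Record that pixel (x, y) changed; out-of-range pixels are ignored.
    pub fn mark(&mut self, x: u8, y: u8) {
        if x >= self.width || y >= self.height {
            return;
        }
        self.bounds = Some(match self.bounds {
            None => (x, x, y, y),
            Some((x0, x1, y0, y1)) => (x0.min(x), x1.max(x), y0.min(y), y1.max(y)),
        });
    }

    /// Return the page-aligned window to send and reset the tracker.
    pub fn take_window(&mut self) -> Option<Window> {
        let (x0, x1, y0, y1) = self.bounds.take()?;
        // Rows map to pages of 8, so y / 8 is written as a shift.
        Some(Window { x0, x1, page0: y0 >> 3, page1: y1 >> 3 })
    }
}

/// Copy the bytes inside `w` from the full framebuffer `fb` into `out`,
/// returning how many bytes were written. `out` must be large enough.
pub fn copy_window(fb: &[u8], width: usize, w: Window, out: &mut [u8]) -> usize {
    let mut n = 0;
    for page in (w.page0 as usize)..=(w.page1 as usize) {
        let row = &fb[page * width + w.x0 as usize..=page * width + w.x1 as usize];
        out[n..n + row.len()].copy_from_slice(row);
        n += row.len();
    }
    n
}
```

In this sketch the caller would then set the display's column and page address range to the window and send the first `n` bytes of the secondary buffer.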
I have been developing on the STM32F411-DISCOVERY board since my Blue Pills are still on their way across the ocean so I can't speak to the results with the Blue Pill, but I have seen some truly impressive results with the Discovery board. If you are only updating half of the display size, the fast_flush results in a 28.5% faster update. If you are only updating a quarter of the display, that jumps up to a 61.9% faster update.
The downside of this approach is that the micro-controller does have to do some additional computation while copying the changed pixels into a secondary buffer. This means that as the size of the minimum bounding rectangle approaches the display size, the fast_flush approach actually becomes slower than just updating the entire display. When the update area is greater than 75% of the display area, fast_flush calls flush to do a full display update.
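The 75% fallback rule can be expressed without any division. The sketch below is an assumption about how such a check might look, with a hypothetical function name; it is not the code from this pull request.

```rust
/// Decide whether a windowed update should fall back to a full flush.
/// Returns true when the window covers strictly more than 75% of the display.
/// (Hypothetical helper; all dimensions are in pixels.)
pub fn use_full_flush(win_w: u32, win_h: u32, disp_w: u32, disp_h: u32) -> bool {
    // window_area / display_area > 3 / 4, rearranged to avoid a division.
    4 * win_w * win_h > 3 * disp_w * disp_h
}
```

On a 128x64 display, a 96x64 window is exactly 75% of the area and would still take the windowed path in this sketch, while anything larger would trigger the full flush.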
The biggest performance jump may come from changing the draw function to take an iterator as a parameter instead of a slice, but I'm not sure how that would work or whether it would introduce any breaking changes.
Improvements
Problems