BMP 4x loading speed improvement #31
Move class from example to library.
I don't have the hardware, but I was curious about the ChatGPT update, so I looked over the code. The changes to the algorithm make sense to me: you're trading memory usage for a more efficient file read method. Do you think the display could be used on more memory-constrained boards, and/or to display larger bitmap files that wouldn't fit into memory? If so, I was thinking that an option to use the old line-by-line method might be of value. I thought it would be fun to try to add the option without adding two complete code branches. I'd be curious whether this works and whether you think it would be useful 😁

BMP Class

import struct  # needed by draw() for struct.pack


class BMP:
    """
    Optimized with ChatGPT by DJDevon3
    https://chat.openai.com/share/57ee2bb5-33ba-4538-a4b7-ec3dea8ea5c7
    Draw Bitmap Helper Class (not hardware accelerated)

    :param str: filename BMP filename
    :param int colors: BMP color data
    :param int data: BMP data
    :param int data_size: BMP data size
    :param int bpp: BMP bit depth data
    :param int width: BMP width
    :param int height: BMP height
    :param int read_header: BMP read header function
    """

    class _BmpParse(object):
        # Context manager: opens the file only in line-by-line (not fast) mode
        def __init__(self, file_name, fast):
            self.file_name = file_name
            self.fast = fast

        def __enter__(self):
            if not self.fast:
                self.file = open(self.file_name, 'rb')
                return self.file
            else:
                return None

        def __exit__(self, *args):
            if not self.fast:
                self.file.close()

    def __init__(self, filename):
        self.filename = filename
        self.colors = None
        self.data = None
        self.data_size = 0
        self.bpp = 0
        self.width = 0
        self.height = 0
        self.read_header()

    def read_header(self):
        """Read file header data"""
        if self.colors:
            return
        with open(self.filename, "rb") as bmp_file:
            bmp_file.seek(10)
            self.data = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(18)
            self.width = int.from_bytes(bmp_file.read(4), "little")
            self.height = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(28)
            self.bpp = int.from_bytes(bmp_file.read(2), "little")
            bmp_file.seek(34)
            self.data_size = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(46)
            self.colors = int.from_bytes(bmp_file.read(4), "little")

    def draw(self, disp, x=0, y=0, fast=True, debug=False):
        """Draw BMP"""
        if debug:
            print("{:d}x{:d} image".format(self.width, self.height))
            print("{:d}-bit encoding detected".format(self.bpp))
        # Each stored row is padded to a multiple of 4 bytes
        line_size = self.width * (self.bpp // 8)
        if line_size % 4 != 0:
            line_size += 4 - line_size % 4
        if fast:
            # Fast mode: read all pixel data into memory in one call
            with open(self.filename, "rb") as bmp_file:
                bmp_file.seek(self.data)
                pixel_data = bmp_file.read()
        with self._BmpParse(self.filename, fast) as bmp_file:
            if not fast:
                bmp_file.seek(self.data)
            disp.set_window(x, y, self.width, self.height)
            line_start = 0
            line_end = line_size
            for line in range(self.height):
                current_line_data = b""
                if fast:
                    line_start = line * line_size
                    line_end = line_start + line_size
                else:
                    # Line-by-line mode: read one row per iteration
                    pixel_data = bmp_file.read(line_size)
                for i in range(line_start, line_end, self.bpp // 8):
                    if (line_end - i) < self.bpp // 8:
                        break
                    if self.bpp == 16:
                        color = self.convert_555_to_565(
                            pixel_data[i] | pixel_data[i + 1] << 8
                        )
                    if self.bpp in (24, 32):
                        color = self.color565(
                            pixel_data[i + 2], pixel_data[i + 1], pixel_data[i]
                        )
                    current_line_data = current_line_data + struct.pack(">H", color)
                disp.setxy(x, self.height - line + y)
                disp.push_pixels(current_line_data)
            disp.set_window(0, 0, disp.width, disp.height)

    @staticmethod
    def convert_555_to_565(color_555):
        """Convert 16-bit color from 5-5-5 to 5-6-5 format"""
        r = (color_555 & 0x1F) << 3
        g = ((color_555 >> 5) & 0x1F) << 2
        b = ((color_555 >> 10) & 0x1F) << 3
        return (r << 11) | (g << 5) | b

    @staticmethod
    def color565(r, g, b):
        """Convert 24-bit RGB color to 16-bit color (5-6-5 format)"""
        return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)

Edit: I was scanning the code and realized that the internal class _BmpParse was going to have a problem in "Fast" mode. I've edited the exit so I think it will work now.
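Another way to offer both read strategies without duplicating the conversion loop might be a generator that yields one padded row at a time, however the bytes were fetched. This is only a sketch run on desktop Python, not the code in this PR; `iter_rows` and its parameters are hypothetical names, and the demo file is a stand-in rather than a real BMP.

```python
import os
import tempfile


def iter_rows(filename, data_offset, line_size, height, fast=True):
    """Yield `height` rows of `line_size` bytes, starting at `data_offset`.

    fast=True reads all pixel data in one call (more RAM);
    fast=False reads one row per call (less RAM).
    """
    with open(filename, "rb") as bmp_file:
        bmp_file.seek(data_offset)
        if fast:
            pixel_data = bmp_file.read(line_size * height)
            for line in range(height):
                start = line * line_size
                yield pixel_data[start:start + line_size]
        else:
            for _ in range(height):
                yield bmp_file.read(line_size)


# Demo on a stand-in file: 4 "header" bytes followed by 3 rows of 8 bytes.
payload = bytes(range(24))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEAD" + payload)
    demo_path = tmp.name

fast_rows = list(iter_rows(demo_path, 4, 8, 3, fast=True))
slow_rows = list(iter_rows(demo_path, 4, 8, 3, fast=False))
os.remove(demo_path)
```

With this shape, the per-pixel conversion loop would only ever see a single row starting at index 0, so it would not need to know which mode is active.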
@RetiredWizard The original demo was designed for a Feather M4, so yes, it can work. The M0 will struggle, but it's doable. At the time this library was written, the M4 was probably top of the line for CircuitPython (other than the Teensy); the S2 and S3 didn't even exist yet. Now that we have more powerful microcontrollers, I think it's time to revisit these larger displays. They are definitely usable on an S3.

If I can find a way to combine the native RA8875 register methods with images and layers, it could be faster than displays half its size thanks to the hardware acceleration. Vector graphics and text drawn with the native methods update almost instantaneously... easily 60fps. Under the right conditions, hooking into hardware acceleration, the S3 can drive vectors on this display faster than displayio can on a 128x64 OLED.

Melissa laid the groundwork on this display driver, but it hasn't been revisited in years. One of the things that made me decide to try it now is the port of the 7" Sunton ESP32-S3 board, which uses a similar display and was recently added to the supported boards.

I like the changes you made and will likely do another commit to fold them in so that the M4 can still work with it. I didn't think about the line chunks and RAM use; you're absolutely right, and the old method should be left in. Thank you for doing that!
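To put rough numbers on the RAM trade-off between the two read methods, the buffer sizes can be estimated from the BMP row-padding rule (each row is padded to a multiple of 4 bytes). The helper names below are hypothetical, and the 800x480 resolution at 24 bits per pixel is an assumed example, not a figure stated in this thread.

```python
def bmp_row_size(width, bpp):
    """Bytes per stored row: rows are padded to a multiple of 4 bytes."""
    return ((width * bpp + 31) // 32) * 4


def bmp_pixel_bytes(width, height, bpp):
    """Bytes returned by a single read() of the whole pixel array."""
    return bmp_row_size(width, bpp) * height


# Assumed example: a full-screen 800x480 image at 24 bits per pixel.
full = bmp_pixel_bytes(800, 480, 24)  # whole image buffered at once
one_row = bmp_row_size(800, 24)       # buffer for the line-by-line method
print(full, one_row)
```

For that assumed image, the single read needs about 1.1 MB of buffer, while the line-by-line method needs 2400 bytes per row.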
Thanks! I'm sort of curious whether the private class I created to handle the with context manager is going to work as I expect. I did try to do some testing (although without the hardware I don't get far), but the testing did find that I left off the "self." in the
After playing around with it for a bit, I figured out that it comes down to moving the function into the library and giving it a method for self.color565. Without the self, it reaches outside to the top of the script for color conversion. I don't think there is a RAM difference, though I have no idea how I'd test that. The one line for

Unfortunately, the speed improvement only really works with the BMP test (rasterized). All of the native register methods with vector are already as fast as they can be.

@RetiredWizard I tried getting your code to work but couldn't. So I went the long way and added an argument to the function with an if/else to test the new method and the old method. When adding self as described above, the old method instantly becomes 4 times faster. There's almost no difference between using the for loop and the with statement. It's the way it was reaching farther for color conversion that was keeping it slow. The way you wrote your context manager is pretty close to the way I was also testing it.

It's worth noting that the sheer number of disabled pylint checks in this library is also what hid the issue from being discovered... specifically the ignore for invalid names. They were ignored because there are too many variables named x1, x2, x3, y1, y2, y3, etc., and it would be hours of work to rename them all properly... but that also hid an actual error for color565 that would have led to the optimization as a natural course of making pylint happy.
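One way to check where the time goes would be to time the same conversion reached through the instance and through a module-level name. The sketch below is a hypothetical harness (the `Converter` class and `timed` helper are illustrative names, not part of the library), and the numbers it prints on a desktop interpreter will not necessarily match what a microcontroller shows.

```python
import time


def color565(r, g, b):
    """Module-level conversion, looked up as a global."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


class Converter:
    @staticmethod
    def color565(r, g, b):
        """Same conversion, looked up through the instance."""
        return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)

    def run_method(self, count):
        total = 0
        for _ in range(count):
            total += self.color565(255, 128, 0)
        return total

    def run_global(self, count):
        total = 0
        for _ in range(count):
            total += color565(255, 128, 0)
        return total


def timed(fn, count):
    """Return (result, elapsed seconds) for one call of fn(count)."""
    start = time.monotonic()
    result = fn(count)
    return result, time.monotonic() - start


conv = Converter()
method_result, method_secs = timed(conv.run_method, 10000)
global_result, global_secs = timed(conv.run_global, 10000)
print("method:", method_secs, "global:", global_secs)
```

Running both loops on the target board with the same count would show whether the lookup path or something else accounts for the difference.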
No idea how that went unnoticed for so long. The blue and green RGB values were incorrect in the simpletest.
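Channel order is easy to get wrong here because a 24-bit BMP stores each pixel as blue, green, red, while the 5-6-5 packing expects red first. A small worked example, using the same `color565` formula as the BMP class and an illustrative pure-red pixel:

```python
import struct


def color565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit 5-6-5 value."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


# A 24-bit BMP pixel is stored as blue, green, red.
bgr_pixel = bytes([0x00, 0x00, 0xFF])  # pure red

# Correct: pass the bytes in reverse order so red lands in the top 5 bits.
color = color565(bgr_pixel[2], bgr_pixel[1], bgr_pixel[0])
packed = struct.pack(">H", color)  # big-endian 16-bit value

# Swapped: passing the bytes in file order turns red into blue.
swapped = color565(bgr_pixel[0], bgr_pixel[1], bgr_pixel[2])
```

Here `color` is 0xF800 (red) and `swapped` is 0x001F (blue).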
Excellent sleuthing 😁 I'm surprised that there's apparently no speed advantage to reading the BMP file with a single read statement rather than one line at a time. I still think reading the entire BMP file into a variable could be a memory problem depending on the size of the image and/or the available resources on the board being used. But maybe that's a problem to be resolved when it's encountered.
Have you tried printing gc.mem_free() at the start and end of your test programs? I don't really understand the under the hood workings of CircuitPython memory but mem_free() is what I use to get a general feel for how much memory my programs are using.
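A minimal sketch of that before/after measurement is below. `gc.mem_free()` is available on CircuitPython and MicroPython but not on desktop CPython, so the helper falls back to `None` there; `free_bytes` and `measure` are hypothetical names.

```python
import gc


def free_bytes():
    """Return free heap bytes on CircuitPython/MicroPython, else None."""
    gc.collect()
    mem_free = getattr(gc, "mem_free", None)  # absent on desktop CPython
    return mem_free() if mem_free else None


def measure(fn, *args):
    """Call fn(*args) and return (result, approximate bytes consumed)."""
    before = free_bytes()
    result = fn(*args)  # result stays referenced, so it counts as used
    after = free_bytes()
    used = None if before is None else before - after
    return result, used


# Stand-in for a large read: allocate a 1024-byte buffer.
buffer, used = measure(bytes, 1024)
print("approximate bytes used:", used)
```

The figure is only a rough guide, since the collector may free or move other objects between the two readings.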
@RetiredWizard I did play around with feeding it more BMP lines per read in chunks, and it was actually slower, so there are no additional chunking improvements to be had. The speed of the color conversion is the limiting factor. I have an idea to use the native vector pixel/line methods to read a BMP pixel by pixel and display it. No clue if that will work, but it might be a way to tie into the native hardware-accelerated vector methods to convert a rasterized image to vector... in theory. The hardware accelerators are all vector based, including the built-in fonts. They work about 100x faster, so if I can fool it into building a rasterized image using its native vector methods... we'll see.
The startup time isn't as fast but the draw time is. Allows BMP class and other file types in the future to be expanded upon more easily. Updated BMPtest example with the new imports and function calls.
Since pylint doesn't like any file over 1000 lines, we'll just move the entire BMP class to its own file and import that. Have confirmed it works as intended with the new changes. It also makes it easier to expand upon the class and add other file types in the future, like JPEG, GIF, etc.
Figured out how to read a single pixel and will push a PR for it shortly. I wrote a playground note on it here.
For a screensaver
Ability to read a pixel color from the display at x,y coordinates and return it as RGB values. The read data16 function makes it easier to chunk 16-bit color to/from a register. Added some missing registers and register descriptions.
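Returning a pixel as RGB values means unpacking the display's 16-bit 5-6-5 color. The sketch below shows one common way to do that conversion; it is not taken from the library, and `rgb565_to_rgb888` and `join_bytes` are hypothetical names.

```python
def join_bytes(high, low):
    """Combine two 8-bit reads into one 16-bit value (high byte first)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def rgb565_to_rgb888(color):
    """Split a 16-bit 5-6-5 value into 8-bit (r, g, b)."""
    r = (color >> 11) & 0x1F
    g = (color >> 5) & 0x3F
    b = color & 0x1F
    # Replicate the high bits into the low bits so full scale maps to 255.
    return (r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2)


red = rgb565_to_rgb888(join_bytes(0xF8, 0x00))
green = rgb565_to_rgb888(0x07E0)
white = rgb565_to_rgb888(0xFFFF)
```

Because 5-6-5 discards the low bits of each channel, the round trip from 24-bit color is lossy; the bit replication only ensures that black and full-scale values come back exactly.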
set debug default to false
Move BMP class from example to library. Because this display relies on its built-in registers and is not compatible with displayio, it needs its own graphics library. This display cannot be integrated with displayio without considerable effort above my skill level. It uses hardware acceleration native to the RA8875 chip for vector primitives and text (with fonts stored on the chip).
BMP loading is not hardware accelerated, but I managed to make BMPs load about 4x faster with the help of ChatGPT.
Video demo showing previous bmptest (approximately 17 seconds) vs updated bmptest (approximately 4 seconds).
Adafruit RA8875 driver board with Adafruit bare 40-pin 7" Touch TFT display running on UM FeatherS3 (N16R8).
RA8875_BMP_Loading.mp4