BMP 4x loading speed improvement #31

DJDevon3 · 2024-04-29T05:43:53Z

Move the BMP class from the example into the library. Because this display is driven through its built-in registers and is not compatible with displayio, it needs its own graphics library. Integrating it with displayio would take considerable effort beyond my skill level. The library uses hardware acceleration native to the RA8875 chip for vector primitives and text (with fonts stored on the chip).

BMP loading is not hardware accelerated, but I managed to make BMPs load about 4x faster with the help of ChatGPT.

Video demo showing previous bmptest (approximately 17 seconds) vs updated bmptest (approximately 4 seconds).

Adafruit RA8875 driver board with Adafruit bare 40-pin 7" Touch TFT display running on UM FeatherS3 (N16R8).

RA8875_BMP_Loading.mp4


RetiredWizard · 2024-04-29T16:17:27Z

I don't have the hardware but I was curious about the ChatGPT update so I looked over the code. The changes to the algorithm make sense to me. You're trading off memory usage for a more efficient file read method.
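To make that trade-off concrete, here is a rough sketch of the two read strategies side by side. `iter_rows` is a made-up helper for illustration and is not part of the library; it only assumes a seekable binary stream.

```python
import io


def iter_rows(stream, data_offset, row_size, height, fast=True):
    """Yield one padded BMP row at a time from a binary stream."""
    stream.seek(data_offset)
    if fast:
        # One big read: fewer file operations, but the whole image sits in RAM.
        pixel_data = stream.read()
        for row in range(height):
            start = row * row_size
            yield pixel_data[start:start + row_size]
    else:
        # One read per row: only row_size bytes are held at a time.
        for _ in range(height):
            yield stream.read(row_size)


# Tiny stand-in for a file: 4-byte header, then 3 rows of 8 bytes each.
fake = io.BytesIO(b"HDR!" + bytes(range(24)))
fast_rows = list(iter_rows(fake, 4, 8, 3, fast=True))
slow_rows = list(iter_rows(fake, 4, 8, 3, fast=False))
```

Both modes hand back the same rows, so the only difference is how much data is held in memory at once.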

Do you think the display could be used on more memory-constrained boards and/or attempt to display larger bitmap files that wouldn't fit into memory? If so, I was thinking that adding the option to use the old line-by-line method might be of value. I thought it would be fun to try to add the option without adding two complete code branches. I'd be curious if this works and if you think it would be useful 😁

BMP Class

import struct


class BMP:
    """
    Optimized with ChatGPT by DJDevon3
    https://chat.openai.com/share/57ee2bb5-33ba-4538-a4b7-ec3dea8ea5c7
    Draw Bitmap Helper Class (not hardware accelerated)
    :param str filename: BMP filename

    Attributes filled in by read_header():
    colors: number of palette colors used
    data: byte offset of the pixel data
    data_size: size of the pixel data in bytes
    bpp: bits per pixel
    width: image width in pixels
    height: image height in pixels
    """

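    class _BmpParse:
        """Context manager that opens the file only in line-by-line mode"""

        def __init__(self, file_name, fast):
            self.file_name = file_name
            self.fast = fast
            self.file = None

        def __enter__(self):
            if self.fast:
                # Fast mode has already read the whole file, nothing to open
                return None
            self.file = open(self.file_name, "rb")
            return self.file

        def __exit__(self, *args):
            if self.file is not None:
                self.file.close()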

    def __init__(self, filename):
        self.filename = filename
        self.colors = None
        self.data = None
        self.data_size = 0
        self.bpp = 0
        self.width = 0
        self.height = 0
        self.read_header()

    def read_header(self):
        """Read file header data"""
        if self.colors:
            return
        with open(self.filename, "rb") as bmp_file:
            bmp_file.seek(10)
            self.data = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(18)
            self.width = int.from_bytes(bmp_file.read(4), "little")
            self.height = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(28)
            self.bpp = int.from_bytes(bmp_file.read(2), "little")
            bmp_file.seek(34)
            self.data_size = int.from_bytes(bmp_file.read(4), "little")
            bmp_file.seek(46)
            self.colors = int.from_bytes(bmp_file.read(4), "little")

    def draw(self, disp, x=0, y=0, fast=True, debug=False):
        """Draw BMP"""
        if debug:
            print("{:d}x{:d} image".format(self.width, self.height))
            print("{:d}-bit encoding detected".format(self.bpp))

        line_size = self.width * (self.bpp // 8)
        if line_size % 4 != 0:
            line_size += 4 - line_size % 4

        if fast:
            with open(self.filename, "rb") as bmp_file:
                bmp_file.seek(self.data)
                pixel_data = bmp_file.read()

        with self._BmpParse(self.filename,fast) as bmp_file:
            if not fast:
                bmp_file.seek(self.data)
            disp.set_window(x, y, self.width, self.height)
            line_start = 0
            line_end = line_size
            for line in range(self.height):
                current_line_data = b""
                if fast:
                    line_start = line * line_size
                    line_end = line_start + line_size
                else:
                    pixel_data = bmp_file.read(line_size)

                for i in range(line_start, line_end, self.bpp // 8):
                    if (line_end - i) < self.bpp // 8:
                        break
                    if self.bpp == 16:
                        color = self.convert_555_to_565(
                            pixel_data[i] | pixel_data[i + 1] << 8
                        )
                    if self.bpp in (24, 32):
                        color = self.color565(
                            pixel_data[i + 2], pixel_data[i + 1], pixel_data[i]
                        )
                    current_line_data = current_line_data + struct.pack(">H", color)
                disp.setxy(x, self.height - line + y)
                disp.push_pixels(current_line_data)
            disp.set_window(0, 0, disp.width, disp.height)

    @staticmethod
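    def convert_555_to_565(color_555):
        """Convert 16-bit color from 5-5-5 to 5-6-5 format"""
        # BMP 5-5-5 layout: bit 15 unused, red in bits 10-14, green 5-9, blue 0-4
        r = (color_555 >> 10) & 0x1F
        g = (color_555 >> 5) & 0x1F
        b = color_555 & 0x1F
        # Widen green to 6 bits by repeating its top bit in the new low bit
        return (r << 11) | (g << 6) | ((g >> 4) << 5) | b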

    @staticmethod
    def color565(r, g, b):
        """Convert 24-bit RGB color to 16-bit color (5-6-5 format)"""
        return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)

Edit: I was scanning the code and realized that the internal class _BmpParse was going to have a problem in "Fast" mode. I've edited the exit so I think it will work now.
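Another way to sidestep the "context manager returns nothing in fast mode" problem would be to hand back a file-like object either way. This is only a sketch: `open_pixels` is a hypothetical helper, and it assumes `io.BytesIO` is available on the board's CircuitPython build. The `tempfile`/`os` lines are just there so the demo can run on a desktop.

```python
import io
import os
import tempfile


def open_pixels(filename, data_offset, fast=True):
    """Return a file-like object positioned at the start of the pixel data."""
    bmp_file = open(filename, "rb")
    bmp_file.seek(data_offset)
    if not fast:
        return bmp_file  # streamed from storage; the caller's `with` closes it
    try:
        return io.BytesIO(bmp_file.read())  # whole image buffered in RAM
    finally:
        bmp_file.close()


# Demo with a throwaway file: 2-byte header followed by 6 bytes of "pixels".
path = os.path.join(tempfile.gettempdir(), "open_pixels_demo.bin")
with open(path, "wb") as f:
    f.write(b"HH" + b"abcdef")

results = {}
for mode in (True, False):
    with open_pixels(path, 2, fast=mode) as src:
        results[mode] = [src.read(3), src.read(3)]
os.remove(path)
```

With this shape the drawing loop could call `src.read(line_size)` in both modes and would not need separate index bookkeeping for the buffered case.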

DJDevon3 · 2024-04-30T03:31:03Z

@RetiredWizard The original demo was designed for a Feather M4, so yes, it can work. The M0 will struggle but it's doable. At the time this library was coded the M4 was probably top of the line for CircuitPython (other than Teensy). The S2 and S3 didn't even exist yet.

Now that we have more powerful microcontrollers my thought is that it's time to revisit these larger displays. They are definitely usable on an S3. If I can find a way to combine the native RA8875 register methods with images and layers it's possible it could be faster than displays half its size due to the hardware acceleration.

The hardware acceleration is almost instantaneously updated for vector graphics or text using the native methods... easily 60fps. Under the right conditions, hooking into hardware acceleration, the S3 can drive vectors on this display faster than displayio can on a 128x64 OLED.

Melissa laid the groundwork on this display driver but it hasn't been revisited in years. One of the things that made me decide to try it now is the port of the 7" Sunton ESP32-S3 board, which uses a similar display and was recently added to the supported boards.

I like the changes you made and will likely do another commit to fold that in so that the M4 can still work with it. I didn't think about the line chunks and ram use, you're absolutely right and the old method should be left in. Thank you for doing that!

RetiredWizard · 2024-04-30T04:47:23Z

Thanks! I'm sort of curious if the private class I created to handle the with context manager is going to work as I expect. I did try and do some testing (although without the hardware I don't get far) but the testing did find that I left off the "self." in the with self._BmpParse(self.filename,fast) as bmp_file: line so I edited the code above again to fix that as well.

DJDevon3 · 2024-05-01T15:56:32Z

After playing around with it for a bit, I figured out that it comes down to moving the function into the library and calling the color conversion as a method, self.color565. Without the self it reaches outside to the top of the script for color conversion. I don't think there is a RAM difference though I have no idea how I'd test that.

The one line for color = color565 vs color = self.color565 makes all the difference in the world. I tried going back and making it work in the example with the embedded BMP class but it was throwing fits. This is better and I bet it will work better even on the M4 too. So this should be a real overall speed improvement regardless of the board though I haven't tested it on anything other than an S3.

Unfortunately the speed improvement only really works with the BMP test (rasterized). All of the native register methods with vector are already as fast as they can be.

@RetiredWizard I tried getting your code to work but couldn't. So I went the long way and added an argument to the function with an if/else to test the new method and old method. When adding self as described above the old method instantly becomes 4 times faster. There's almost no difference in using the for loop vs the with statement. It's the way it was reaching farther for color conversion that was keeping it slow. The way you wrote your context manager is pretty close to the way I was also testing it.
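For anyone curious what keeping the conversion lookup close to the loop can look like, here is a rough sketch. It does not reproduce the timing described above; it only shows the common trick of binding the functions to local names once before the per-pixel loop. `row_to_565` is a made-up helper, not the library's draw code.

```python
import struct


def color565(r, g, b):
    """Pack 8-bit R, G, B into a 16-bit 5-6-5 value."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


def row_to_565(row, bytes_per_pixel=3):
    """Convert one row of BGR(A) bytes into big-endian RGB565 bytes."""
    convert = color565  # looked up once, outside the loop
    pack = struct.pack
    out = bytearray()
    for i in range(0, len(row) - bytes_per_pixel + 1, bytes_per_pixel):
        out += pack(">H", convert(row[i + 2], row[i + 1], row[i]))
    return bytes(out)


# Two pixels stored as B, G, R: pure red, then pure blue.
converted = row_to_565(bytes([0, 0, 255, 255, 0, 0]))
```

Whether this helps on a given board would need to be measured; the point is only that the per-pixel loop is where lookup cost adds up.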

It's worth noting that the sheer number of disabled pylint errors in this library is also what hid the issue from being discovered, specifically the ignored invalid names. They were ignored because there are too many variables named x1, x2, x3, y1, y2, y3, etc., and it would be hours of work to rename them all properly. But that also hid an actual error for color565 that would have led to the optimization as a natural course of making pylint happy.

DJDevon3 · 2024-05-01T18:49:45Z

No idea how that went unnoticed for so long. The blue and green RGB values were incorrect in the simpletest.

RetiredWizard · 2024-05-02T03:23:04Z

> When adding self as described above the old method instantly becomes 4 times faster. There's almost no difference in using the for loop vs the with statement. It's the way it was reaching farther for color conversion that was keeping it slow.

Excellent sleuthing 😁 I'm surprised that there's apparently no speed advantage to reading the BMP file with a single read statement rather than one line at a time. I still think reading the entire BMP file into a variable could be a memory problem depending on the size of the image and/or the available resources on the board being used. But maybe that's a problem to be resolved when it's encountered.
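For a rough sense of scale, the buffer size can be estimated from the header fields. A quick sketch, assuming a full-screen 800x480 24-bit image (the resolution is my assumption for the 7" panel, not something stated above); the function names are made up for illustration.

```python
def bmp_row_size(width, bpp):
    """Bytes per BMP row, padded up to a multiple of 4."""
    row = width * (bpp // 8)
    return (row + 3) & ~3


def bmp_buffer_size(width, height, bpp):
    """Bytes needed to hold every row of the image at once."""
    return bmp_row_size(width, bpp) * height


full_screen = bmp_buffer_size(800, 480, 24)  # 1_152_000 bytes for the whole image
one_row = bmp_row_size(800, 24)              # 2_400 bytes for a single row
```

So the single-read approach needs a bit over 1 MB for that image, while the line-by-line approach only ever holds a couple of kilobytes of pixel data.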

> I don't think there is a RAM difference though I have no idea how I'd test that.

Have you tried printing gc.mem_free() at the start and end of your test programs? I don't really understand the under the hood workings of CircuitPython memory but mem_free() is what I use to get a general feel for how much memory my programs are using.
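Something along these lines is what I have in mind. It's only a sketch: `free_bytes` and `measure` are made-up helper names, and the `getattr` fallback is there because `gc.mem_free()` exists on CircuitPython and MicroPython but not on desktop Python.

```python
import gc


def free_bytes():
    """Free heap in bytes, or None where gc.mem_free() doesn't exist."""
    gc.collect()
    mem_free = getattr(gc, "mem_free", None)  # CircuitPython/MicroPython only
    return mem_free() if mem_free else None


def measure(func, *args):
    """Run func and report roughly how many bytes its result keeps alive."""
    before = free_bytes()
    result = func(*args)
    after = free_bytes()
    used = None if before is None else before - after
    return result, used


# Stand-in workload: allocate a 4 KB buffer and see what it costs.
buffer, used = measure(bytes, 4096)
print("bytes held:", used)
```

Wrapping the fast and line-by-line draw calls in `measure` should give a general feel for the difference, though the numbers will only ever be approximate.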

DJDevon3 · 2024-05-02T08:25:12Z

@RetiredWizard I did play around with feeding it more BMP read lines in chunks and it was actually slower, so there are no additional chunking improvements to be had. The speed of the color conversion is the limiting factor. I have an idea to use the native vector pixel/line methods to read a BMP pixel by pixel and display it. No clue if that will work, but it might be a way to tie into the hardware-accelerated vector methods to convert a rasterized image to vector... in theory.

The hardware accelerators are all vector based including the built-in fonts. They work about 100x faster so if I can fool it into building a rasterized image using its native vector methods... we'll see.

The startup time isn't as fast but the draw time is. Allows BMP class and other file types in the future to be expanded upon more easily. Updated BMPtest example with the new imports and function calls.

DJDevon3 · 2024-05-02T10:36:22Z

Since pylint doesn't like any file over 1000 lines, we'll just move the entire BMP class to its own file and import that. I have confirmed it works as intended with the new changes. It also makes it easier to expand upon the class and add other file-type classes in the future, like JPEG, GIF, etc.

DJDevon3 · 2024-05-02T23:58:45Z

Figured out how to read a single pixel and will push a PR for it shortly. I wrote a playground note on it here.


DJDevon3 · 2024-06-02T12:59:10Z

Added read single pixel test. Reads 12 pixels from display and replicates the colors with filled rectangles. This is an example of a good test.
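A pixel read back from the display is a 16-bit 5-6-5 word, so returning RGB values means unpacking it. Here is a rough sketch of that step in plain Python so it can be checked off-device; `rgb565_to_rgb888` is an illustrative name and not necessarily how the function in this PR is written.

```python
def color565(r, g, b):
    """Pack 8-bit R, G, B into a 16-bit 5-6-5 value."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


def rgb565_to_rgb888(color):
    """Unpack a 16-bit 5-6-5 value into 8-bit (r, g, b)."""
    r5 = (color >> 11) & 0x1F
    g6 = (color >> 5) & 0x3F
    b5 = color & 0x1F
    # Repeat the high bits in the low bits so full scale maps to 255, not 248.
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)


white = rgb565_to_rgb888(0xFFFF)
red = rgb565_to_rgb888(color565(255, 0, 0))
```

Converting back with `color565` returns the original 16-bit value, which makes for an easy sanity check when replicating colors with filled rectangles.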

BMP speed improvement

46f2fab

Move class from example to library.

corrected Blue & Green RGB values

d303657

moved BMP class to separate file

437c77c

The startup time isn't as fast but the draw time is. Allows BMP class and other file types in the future to be expanded upon more easily. Updated BMPtest example with the new imports and function calls.

DJDevon3 added 3 commits May 6, 2024 23:55

infinite pixel screensaver example

090468c

For a screensaver

add read single pixel function and colorchart test script

457bae4

ability to read a pixel color from the display at x,y coordinates and return as RGB values. Read data16 function makes it easier to chunk 16-bit color to/from register. Added some missing registers and register descriptions.

missed setting a debug false by default

9f3b0c7

set debug default to false
