update: Refactoring Code-class and new Custom pygments based formatter #3905

Open
wants to merge 5 commits into
base: main

OliverStrait

Overview: What does this pull request change?

  • Refactoring Code class
  • Added new Formatter class CodeColorFormatter
  • Added tests and a new utility file for testing reading and formatting from a code file.

Motivation and Explanation: Why and how do your changes improve the library?

  • The old Code parser is a convoluted, hard-to-follow custom Python parser with poor naming and a jumping, overlapping procedure. It used pygments' HtmlFormatter as a backend and then parsed the resulting HTML into a list[tuple[str, str]] structure, which is used for coloring.
  • The new custom formatter skips the HTML step and transforms code directly into that list mapping using pygments lexers and tokenizer functions.
  • Better naming, with responsibilities encapsulated into logical chunks.
  • The new system is about 2 to 3 times faster, but Mobject constructors are still the performance bottleneck.
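
As a rough illustration of the "skip HTML" idea above, here is a minimal sketch that goes straight from pygments tokens to (text, color) pairs. It is not the PR's CodeColorFormatter; the function name `tokens_with_colors` and its parameters are hypothetical, and only public pygments calls (`lex`, `get_lexer_by_name`, `get_style_by_name`, `style_for_token`) are used.

```python
from pygments import lex
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_style_by_name


def tokens_with_colors(code, language="python", style_name="default"):
    """Return a list of (text, color) pairs, one per pygments token.

    Hypothetical helper for illustration only. The color is a "#rrggbb"
    string, or None when the style defines no color for that token type.
    """
    lexer = get_lexer_by_name(language)
    style = get_style_by_name(style_name)
    pairs = []
    for token_type, text in lex(code, lexer):
        # style_for_token returns a dict; "color" is a hex string without "#".
        color = style.style_for_token(token_type)["color"]
        pairs.append((text, f"#{color}" if color else None))
    return pairs


if __name__ == "__main__":
    for text, color in tokens_with_colors("x = 1\n"):
        print(repr(text), color)
```

No HTML is generated or parsed here, which is presumably where most of the speedup comes from; the pairs can then be grouped by line before they are handed to the text mobject.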

code_class-new_perf

Links to added or changed documentation pages

Further Information and Comments

  • If Paragraph supported a color-to-text mapping, we could remove the whole block of re-coloring code that runs after object construction (a small performance boost).
  • Tested with pygments 2.18.0 and the Python language.
    ** There may be odd cases where the pygments tokenizer sneaks newlines into unexpected places. Tests with different syntaxes and languages would be worthwhile.
    ** If uncaught newlines inside string literals are passed to Paragraph, they will break re-coloring and line numbers.
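
To make the newline concern concrete: a token such as a triple-quoted string can contain "\n" in the middle of its text, so a per-line color mapping has to split tokens at newlines instead of assuming one token stays on one line. A minimal sketch, assuming the token stream is already a list of (text, color) pairs; the function name `split_into_lines` and the sample colors are hypothetical and not taken from the PR.

```python
def split_into_lines(pairs):
    """Group (text, color) pairs by source line.

    A token whose text contains newlines is cut at each newline, and every
    piece keeps the color of the original token.
    """
    lines = [[]]
    for text, color in pairs:
        for index, part in enumerate(text.split("\n")):
            if index > 0:
                # Each newline inside the token starts a new output line.
                lines.append([])
            if part:
                lines[-1].append((part, color))
    return lines


if __name__ == "__main__":
    # Hand-written token stream with a newline hidden inside a string literal.
    stream = [
        ("s", "#000000"),
        (" = ", None),
        ('"""a\nb"""', "#BA2121"),
        ("\n", None),
        ("t", "#000000"),
    ]
    for number, line in enumerate(split_into_lines(stream), start=1):
        print(number, line)
```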

Reviewer Checklist

  • The PR title is descriptive enough for the changelog, and the PR is labeled correctly
  • If applicable: newly added non-private functions and classes have a docstring including a short summary and a PARAMETERS section
  • If applicable: newly added functions and classes are tested

@JasonGrace2282
Member

JasonGrace2282 commented Aug 14, 2024

This seems interesting - Code has been in need of a rework for a while. Hopefully, I'll get the chance to look at this in a bit :)

Oliver Strait added 4 commits August 16, 2024 10:55
* Refactoring whole Code class from old HTML parser to use new custom Formatter
* Cutting Code-class internal attributes whose information comes from the user
* Moved Object Constructor methods to separate functions
Deprecated:
* Generate_html_file argument won't fit with the current formatter.
* Indentation_chars argument is no longer needed
Feature:
* Custom Formatter based on Pygments backend.
* Utility_file to Code-class testing
* Added tests
* Runtime error handling
tested:
* pygments 2.18.0
* Little linter-error fixes
* Little linter-error fixes
* More error fixes
so it can handle more hidden Tokenizer quirks automatically
* Removing newlines from the end -> those cause the Paragraph class
to misbehave, and the end product had misaligned code line numbers
@OliverStrait
Author

  • I made the parser more general so it handles odd cases where the tokenizer produces string literals with newlines hidden inside.
  • Somehow the Paragraph class cannot handle empty newlines at the end of a string (the line numbers then misalign), so I added a parser step to remove those.
  • (Also fixed the messed-up git history.)
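
The trailing-newline fix described above can be pictured roughly as follows. This is only a sketch of the idea, not the parser added in this PR; `strip_trailing_newlines` and `line_numbers` are hypothetical names.

```python
def strip_trailing_newlines(code):
    """Drop newlines at the end of the source, keeping interior blank lines."""
    return code.rstrip("\n")


def line_numbers(code, start=1):
    """Return one line-number label per line of the cleaned source."""
    count = code.count("\n") + 1
    return [str(number) for number in range(start, start + count)]


if __name__ == "__main__":
    raw = "a = 1\n\nb = 2\n\n\n"
    cleaned = strip_trailing_newlines(raw)
    # Three code lines remain (one of them blank), so three labels are needed.
    print(line_numbers(cleaned))
```

Counting lines on the cleaned string keeps the number of labels equal to the number of rendered code lines, which is the alignment the comment refers to.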
