Git repository has grown to 1.24 GiB #343

Open
Wilfred opened this issue Aug 24, 2022 · 7 comments

Comments

@Wilfred
Owner

Wilfred commented Aug 24, 2022

This is too much. It makes CI slower and contributing slower.

The git subtrees are getting too big. We might have to rewrite history to use snapshots of vendored parsers.

Wilfred added a commit that referenced this issue Aug 27, 2022
This is already in the main branch and it makes the repository bigger
than it needs to be.

Should slightly improve #343
@filmor
Contributor

filmor commented Sep 10, 2022

Looking at the objects, the main culprits are the precompiled vendor/*/src/parser.c files. I doubt that this can be fixed without excluding these and either generating them at build time with tree-sitter generate from the grammar or just not vendoring them at all. Quite a few grammars are available on crates.io already.
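
For anyone who wants to reproduce the "looking at the objects" step, here is a minimal sketch, not necessarily how the numbers above were obtained. It assumes you have captured the output of `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'` as text. The `Blob` struct and `largest_blobs` function are hypothetical names for illustration only.

```rust
/// One blob found in the repository history (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct Blob {
    size: u64,
    path: String,
}

/// Parse lines of the form "<type> <sha> <size> <path>" and return the
/// `n` largest blobs, biggest first. Non-blob and malformed lines are skipped.
fn largest_blobs(batch_check_output: &str, n: usize) -> Vec<Blob> {
    let mut blobs: Vec<Blob> = batch_check_output
        .lines()
        .filter_map(|line| {
            // splitn(4, ..) keeps paths that contain spaces in one piece.
            let mut fields = line.splitn(4, ' ');
            let kind = fields.next()?;
            let _sha = fields.next()?;
            let size = fields.next()?.parse::<u64>().ok()?;
            let path = fields.next()?;
            if kind == "blob" {
                Some(Blob { size, path: path.to_string() })
            } else {
                None
            }
        })
        .collect();
    // Sort descending by size, then keep only the top `n` entries.
    blobs.sort_by(|a, b| b.size.cmp(&a.size));
    blobs.truncate(n);
    blobs
}
```

Calling `largest_blobs(&output, 20)` on the captured text would list the twenty largest file versions ever committed, which is where old copies of generated files would be expected to show up.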

@Xuanwo
Contributor

Xuanwo commented Sep 12, 2022

I don't know if releasing sub-crates like difftastic-language-xxx is a good idea.

@filmor
Contributor

filmor commented Sep 12, 2022

> I don't know if releasing sub-crates like difftastic-language-xxx is a good idea.

That's not what I meant. There are quite a few tree-sitter-* crates that one could depend on instead of vendoring them.

@Wilfred
Owner Author

Wilfred commented Sep 13, 2022

The majority of parsers in difftastic are either not available on crates.io, or the versions on crates.io are old.

I agree that the vendor/*/src/parser.c files are the biggest, and the SQL parser is particularly big: m-novikov/tree-sitter-sql#59

If difftastic just had a snapshot of each parser, it wouldn't have the history of these large files, substantially reducing the size.

Alternatively, maybe it would make sense to look at creating the parser.c files during the build too. This would enable use of the new, faster ABI (tree-sitter/tree-sitter#1852), and the Swift parser already doesn't have parser.c checked in.
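
As a rough sketch of what generating parser.c during the build could look like, the helper below shells out to `tree-sitter generate` inside a grammar directory. It assumes the tree-sitter CLI is installed and on PATH, and that each vendored directory contains a grammar.js. The names `generate_command` and `generate_parser` are hypothetical; this is not difftastic's actual build script.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the command that regenerates `src/parser.c`
/// for one vendored grammar directory. Assumes `tree-sitter` is on PATH.
fn generate_command(grammar_dir: &Path) -> Command {
    let mut cmd = Command::new("tree-sitter");
    // `tree-sitter generate` reads grammar.js from the working directory.
    cmd.arg("generate").current_dir(grammar_dir);
    cmd
}

/// Run the generator in `grammar_dir`, returning a message on failure.
fn generate_parser(grammar_dir: &Path) -> Result<(), String> {
    let status = generate_command(grammar_dir)
        .status()
        .map_err(|e| format!("could not run tree-sitter: {}", e))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!(
            "tree-sitter generate failed in {}",
            grammar_dir.display()
        ))
    }
}
```

A build.rs `main` could call `generate_parser` for each vendored grammar before compiling the resulting C. The trade-off is that every contributor and CI job would then need the tree-sitter CLI (and whatever it depends on) available at build time.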

@Xuanwo
Contributor

Xuanwo commented Sep 13, 2022

> Alternatively, maybe it would make sense to look at creating the parser.c files during the build too.

I prefer this approach. I'm interested in implementing it. Any notes for me?

@nogweii

nogweii commented Oct 12, 2022

I think dynamically loading the parsers is the way forward: #356 & #123

@cglong

cglong commented Feb 26, 2023

Could Git submodules be used here? That way, you could link to a specific version of each dependency without embedding it directly into the repo.

hugo-vrijswijk pushed a commit to hugo-vrijswijk/difftastic that referenced this issue Mar 14, 2024
chore: generate and sync latest changes