-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster recursive loading of indexed content #61
Conversation
That's great, thanks! I've been a bit swamped with LeanInfer recently and will take a look when I get a chance. |
@CodiumAI-Agent /review |
PR Analysis
PR Feedback
How to use
|
Merging to |
xml_to_index = ( | ||
str(root_dir) + "/" + dep_path_str.replace("/src/", "/lib/") | ||
).replace(".lean", ".trace.xml") | ||
new_traced_file = TracedFile.from_xml(root_dir, xml_to_index, repo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this new_traced_file
depends on another file? Maybe we should add it to traced_files
to make sure it is processed recursively?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct that it alters the returning dependency tree. I think we should change it and add the new_traced_file into the for loop. Good catch.
But at the same time, I am still wondering at which point we really need second-order dependencies. If I want to prove a theorem 2, that uses theorem 1, I am not interested in the premises of theorem 1, right? However, I don't know the code well enough to decide whether all relevant information can be retrieved from first-order dependencies.
@josojo Please see the comment in the code. Since it has been merged into |
Currently, loading traced repos ( by running TracedRepo.load_from_disk) of simple repos like the https://github.com/yangky11/lean4-example takes quite some time on my machine.
I reduced the time by a factor of nearly 10x by loading only the needed traced files. This means we first only read the traced files of the main repo and then only the traced libraries that we really depend on when building the dependency graph.
This PR is quite important for me as it allows me to iterate much quicker on the code, as simple tests take much shorter. If we use the same tracing technique, then the unit tests would run on small machines at reasonable times. But this is for a future PR.
Since this recursive loading is not guaranteed to be faster(it's not so easy to parallelize), I put the recursive loading behind a FLAG.