[MORAE] Make new codebase2amr endpoint that uses the Linespan endpoint #621

Closed
Tracked by #632
Free-Quarks opened this issue Nov 7, 2023 · 0 comments · Fixed by #637
This is an additional endpoint, perhaps called `llm-assisted-code2amr`, which will use the existing Linespan endpoint to sub-select only the code relevant to an AMR extraction and send that through the code-snippets-to-AMR pipeline.
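The flow described above can be sketched end-to-end. Note that the base URL, endpoint paths, and payload shapes below are assumptions for illustration, not the actual skema.rest API:

```python
import json
import urllib.request  # stdlib HTTP client; the endpoints below are hypothetical

SKEMA = "http://localhost:8000"  # hypothetical base URL for the unified service


def slice_to_linespan(source: str, start: int, end: int) -> str:
    """Keep only the lines inside a 1-indexed, inclusive linespan."""
    return "\n".join(source.splitlines()[start - 1:end])


def _post_json(path: str, payload: dict) -> dict:
    req = urllib.request.Request(
        SKEMA + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def llm_assisted_code2amr(source: str) -> dict:
    # 1. Ask the LLM-backed linespan endpoint where the model dynamics live.
    span = _post_json("/linespan", {"code": source})  # hypothetical path/shape
    # 2. Sub-select the code to only the relevant region.
    snippet = slice_to_linespan(source, span["start"], span["end"])
    # 3. Send the snippet through the existing code-snippets-to-AMR pipeline.
    return _post_json(
        "/code-snippets-to-pn-amr",  # hypothetical path/shape
        {"files": ["snippet.py"], "blobs": [snippet]},
    )
```

The slicing step is the key idea: only the linespan the LLM identifies as model dynamics reaches the snippet pipeline.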

Some notes:

  • We want both this endpoint and the Linespan endpoint available in the unified service. TA-4 is also interested in using the linespan endpoint on its own, for sending things to our snippets endpoint via the HMI. This is also why the linespan endpoint's output has the shape it does: they already had support for that data structure.
  • The linespan endpoint currently uses GPT-3.5 for the extraction. This is temporary until we replace it with our own model that operates on function networks. A downside of the LLM approach (besides response time) is that it operates on the source code itself, and currently on only one code file at a time. So despite us calling this codebase2amr, it will only work on one file in the zip until we swap in our own model. I didn't think it was worth the effort to engineer it to handle a codebase of arbitrary size, since we will hopefully be replacing it soon, but that is an option and worth noting.
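Until the bespoke model lands, a caller can respect the one-file limit by pulling a single code file out of the zip before invoking the endpoint. A minimal sketch, where the extension heuristic is an assumption rather than SKEMA's actual selection logic:

```python
import io
import zipfile


def first_code_file(zip_bytes: bytes,
                    exts: tuple = (".py", ".f90", ".c")) -> tuple:
    """Return (name, contents) of the first code file found in the archive.

    The extension list is a placeholder heuristic; the real service may
    choose the file differently.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(exts):
                return name, zf.read(name).decode("utf-8")
    raise ValueError("no code file found in archive")
```

This keeps the single-file assumption explicit at the call site instead of silently dropping the rest of the codebase.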
vincentraymond-ua added a commit that referenced this issue Nov 14, 2023
## Summary of changes
Adds a new workflow endpoint to skema.rest
`llm-assisted-codebase-to-pn-amr` that slices the source code based on
model-dynamics linespans determined by an LLM. This greatly increases the
accuracy of AMR generation.

Enables support for generating AMR for the CHIME-SIR model, which was
previously failing with the normal `codebase-to-pn-amr` endpoint.

Adds a basic test case for testing CHIME-SIR->AMR generation.

Resolves #621
Resolves #628

---------

Co-authored-by: Justin <[email protected]>
github-actions bot added a commit that referenced this issue Nov 14, 2023 · e740ac1