[code2FN] Tree-Sitter to CAST Missing Identifiers #740

Closed
titomeister opened this issue Dec 22, 2023 · 2 comments · Fixed by #745

@titomeister
Contributor

The Python Tree-Sitter to CAST implementation drops the identifiers of certain elements.
In particular, identifiers on the very first line of the source file come back missing.

We suspect a bug in the get_identifier() function in node_helper.py: something is off about the indexing.
Adding a blank line at the top of the source file brings the identifiers back.

The simplest reproduction is a source file containing only this line:
x = 2

with no blank lines above it.

@titomeister titomeister added this to the [DARPA] Milestone 11 milestone Dec 22, 2023
@titomeister titomeister self-assigned this Dec 22, 2023
@vincentraymond-ua
Contributor

This is related to how we use the line length sums when computing the position of an identifier on line 0.

There are two changes needed to fix this issue:

  1. Initialize the line length sums list with 0 at index 0.
self.line_length_sums = [0] + list(itertools.accumulate(self.line_lengths))
  2. Update the logic for calculating the start_index.
start_index = self.line_length_sums[start_line] + start_column
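
As a rough illustration of why the leading 0 matters, here is a minimal standalone sketch. It is not the actual node_helper.py code: the `SourceIndex` class and its `offset` and `text_between` methods are hypothetical names, and it assumes Tree-Sitter-style zero-based (line, column) points with columns counted in characters. With the 0 prepended, `line_length_sums[n]` is the offset at which line `n` starts, so line 0 needs no special case.

```python
import itertools


class SourceIndex:
    """Hypothetical helper mapping (line, column) points to string offsets."""

    def __init__(self, source: str):
        self.source = source
        # Keep line endings so the lengths add up to real offsets in `source`.
        self.line_lengths = [len(line) for line in source.splitlines(keepends=True)]
        # Entry n is the offset of the first character of line n; entry 0 is 0.
        self.line_length_sums = [0] + list(itertools.accumulate(self.line_lengths))

    def offset(self, line: int, column: int) -> int:
        # Same formula as the proposed start_index calculation.
        return self.line_length_sums[line] + column

    def text_between(self, start: tuple, end: tuple) -> str:
        # `start` and `end` are zero-based (line, column) pairs.
        return self.source[self.offset(*start):self.offset(*end)]


index = SourceIndex("x = 2\ny = x + 1\n")
first = index.text_between((0, 0), (0, 1))   # identifier on line 0
second = index.text_between((1, 0), (1, 1))  # identifier on line 1
print(first, second)
```

The same lookup works for every line, which is the point of the change: no separate branch for the first line is required.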

@vincentraymond-ua vincentraymond-ua self-assigned this Jan 9, 2024
@myedibleenso myedibleenso changed the title from "[code2FN] Python Tree-Sitter CAST Missing Identifiers" to "[code2FN] Tree-Sitter to CAST Missing Identifiers" Jan 9, 2024
@myedibleenso
Collaborator

myedibleenso commented Jan 9, 2024

Needs test (using program in PR description) + PR

github-actions bot added a commit that referenced this issue Jan 12, 2024
This PR introduces support for generating CAST for Loops (for/while)
using tree-sitter, as part of the ongoing effort to port over the Python
AST to CAST generation to using tree-sitter.

### Python Tree Sitter
- Added support for generating CAST Loop nodes using Python tree sitter.
- Specifically, we added support for Python's For and While loop CAST
generation.
- Added some support for tree-sitter "patterns": "list_pattern",
"tuple_pattern", "list_pattern". These are used primarily in the For
loop syntax as the item we're iterating over.
- Also added support for generating Iterator function calls, as used by
the Python For Loops.
- Updated the CAST to AGraph visualizer to better visualize tuples.

### Testing

- Added some small unit tests for loops (for/while) to determine
consistency.
- Added a small unit test for detecting missing identifiers in the first
line of a Python program.

### Other Fixes
- Fixes an issue with missing identifier names when they appeared on the
first line of the Python program. This was done by adding some
additional handling that is specific to the first line of the program.

Resolves #498
Resolves #740 f2b27d4