To extract data from the PSI trees of a Java project via psiminer, use this command:
$ python -m scripts.prepocess <path-to-project-folder>
The resulting .c2s file with samples and the vocabulary table are written to the datasets directory, in a subdirectory named after the project.
You can change extraction parameters (e.g., type preserving) by modifying configs/psiminer_config.json.
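If you prefer to change parameters from a script rather than by hand, a minimal sketch could look like the following. It assumes only that the config is a JSON object at the top level; the helper name `update_config` is hypothetical and not part of this repository, and the real parameter names must be taken from configs/psiminer_config.json itself.

```python
import json
from pathlib import Path


def update_config(config_path, overrides):
    """Load a JSON config, apply top-level overrides, and write it back.

    Hypothetical helper: assumes the file holds a single JSON object.
    Returns the updated config as a dict.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config.update(overrides)  # only top-level keys are replaced
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `update_config("configs/psiminer_config.json", {"someParameter": True})`, where `someParameter` is a placeholder for a key that actually exists in the config.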
To calculate quality metrics on a preprocessed project via code2seq, use this command:
$ python -m scripts.test_single <path-to-preprocessed-project> <path-to-model-checkpoint>
The test_single function returns the list of calculated metrics.
$ python -m scripts.test_single <path-to-preprocessed-raw-Java-dataset> <path-to-model-checkpoint> <path-to-results-storage-folder>
This command automatically preprocesses and evaluates the dataset and produces a results.csv file, in which all metrics and project names are stored under a header row.
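Since results.csv has a header row, it can be read back with Python's standard csv module. The sketch below is only an illustration: the helper name `load_results` is hypothetical, and the column names are whatever the header of your results.csv contains.

```python
import csv


def load_results(csv_path):
    """Read a CSV file with a header row into a list of dicts, one per row.

    Hypothetical helper: keys come from the header, all values are strings.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Calling `load_results` on the path to your results.csv returns one dict per project; convert metric values with `float(...)` before doing arithmetic on them.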