This is the first official release of our library for training and testing open-vocabulary neural language models of source code. Closed-vocabulary models are also supported.
The library supports:
- entropy calculation
- measuring code completion performance on a file or test dataset
- dynamic model updates
- dynamic code completion
- measuring identifier completion performance
- an optional identifier cache
The implementation was used for the experiments in the paper: "Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code".