-
-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move schemes into separate JSON files #13
Conversation
Hey @psvenk, this is an interesting idea! Personally, I don't think we gain much by using the TOML format over the JSON format in this specific case, but of course, it's basically the same. We do avoid adding a dependency by using JSON, but, if we're already adding a build step, adding such a well-tested dependency is arguably a non-issue. If you're going to introduce a build step, why not go all the way and split each language out into its own file? Regardless of whether we stick with JSON or TOML, we could define a Sanscript Scheme schema using something like So basically:
-- OR --
We could even have a GitHub action handle this for us. |
@rramphal Thanks for the speedy response!
I agree that there is not much to gain in this case, especially with splitting each scheme into its own file (in which case the files consist of only keys mapping to arrays). JSON does have the benefit of being more widely used in the JS world too; maybe we could introduce a description field containing what is currently in comments (possibly along with a way for client code to display the description of a scheme).
Thanks for pointing this out; I'll do this in a commit soon (in JSON).
This is a great idea. We could also add to the schema some properties which are currently implemented as special cases in the code. The fact that a somewhat mature schema representation language is available for JSON but not for TOML leads me to believe that we should use JSON for storing data on the scripts/romanizations for consistency. I also like the idea of verifying this with CI. |
Good - I like the general idea as well as the idea of splitting into separate json/toml files. If we're ditching toml because schema specification and validation is simpler elsewhere, might as well pick https://json5.org/ . Action item for later:
|
111ef31
to
08a42b2
Compare
This has the benefits of separating data from code and of having a language-agnostic specification of schemes (for implementations of Sanscript in other programming languages). This also adds a build step integrating the scheme data into the JS file -- the file that now contains only the code has been moved to a subdirectory and the build step (npm run dist) outputs to sanscript.js (so this causes no breaking changes).
08a42b2
to
49fa16b
Compare
I have switched to JSON, split @vvasuki Thank you very much for your response.
I did a little bit of looking around and it seems that JSON5 is quite JavaScript-specific and that implementations of JSON Schema in other languages generally do not support JSON5, so this would mean introducing a Node.js dependency to other implementations of Sanscript in order to convert the JSON5 to JSON. So, if I understand correctly, JSON5 would be simpler to use than TOML in the case of JavaScript (because
Sounds good! |
Ok - sent you an invite to be a maintainer. Once ready and @rramphal takes a look, merge into master and let me know so that I can update npm. |
Solid work, @psvenk! I'd like to suggest some improvements, but they are all pretty much outside the scope of this PR. I can get to them in a couple of hours. After @psvenk merges this into master, @vvasuki, if you can hold off on publishing to npm, I'd be glad to submit a PR with some ideas for improvements if that sounds good. |
Great. I already have a schema done, so I can open a separate PR after merging this to gather feedback about the schema. The schema was written so that the current schemes pass validation, but some changes to the scheme format and the logic would be necessary to make the schema simpler (e.g. replace singleton arrays with simple strings). Additionally, some hard-coded rules (e.g. Brahmic vs. roman and the definition of the Kolkata scheme) should be moved out of the logic and made into parameters in the schema or moved into |
@psvenk Awesome. I think getting a validator in to represent the current schema for now, and then following up with a PR later to move some of the logic over sounds like a good plan. |
This has the benefits of separating data from code and of having a language-agnostic specification of schemes (for implementations of Sanscript in other programming languages). This also adds a build step integrating the scheme data into the JS file -- the file that now contains only the code has been moved to a subdirectory and the build step (npm run dist) outputs to sanscript.js (so this causes no breaking changes).
This has the benefits of separating data from code and of having a language-agnostic specification of schemes (for implementations of Sanscript in other programming languages).
This also adds a build step integrating the scheme data into the JS file — the file that now contains only the code has been moved to a subdirectory and the build step (
npm run dist
) outputs tosanscript.js
(so this causes no breaking changes). There is also an additional development dependency of@iarna/toml
.I chose TOML over JSON and other formats because it supports comments and it can represent the scheme data in a simple and readable format without any indentation. I generated the TOML file using a script taking the initial declaration of
Sanscript.schemes
and serializing it using the TOML library, followed by adjusting the formatting and adding in the comments from the JS file. Having a separate TOML file actually improves JSON compatibility because the following Node.js script converts the TOML file to valid JSON (automatically stripping comments):Code
I have marked this pull request as "draft" because there is some refactoring to be done before the TOML file is truly portable between programming languages (e.g. some encodings, such as Kolkata, are defined in code), but I wanted to get your thoughts on the idea before proceeding with that.
All tests (35 tests; 780 assertions) are passing when I open
tests/index.html
in my browser.