SRE Version 4: Now in TypeScript
This is the full conversion of Speech Rule Engine to TypeScript, with a partial re-implementation of some features such as rule and locale handling. Below is a (very likely incomplete) list of all the changes. Please see the acknowledgements after the TLDR.
TLDR
- SRE moves to ES6 using TypeScript and webpack:
- API now uses promises for engine setup
- Single bundle file for both node and browser
- Support for alternative bundlers
- New locales for Norwegian (Bokmal and Nynorsk), Swedish, and Catalan
- Support for two-dimensional formula layout in Nemeth Braille
- Major rewrite of rule handling
- Smaller memory footprint of indexed rules
- Smaller locale files
- All localisation now in a dedicated repository, sre-l10n
- Bespoke YAML format for speech rules for easier translation
- CrowdIn support for simple message translations
- New API methods for generating word representations of numbers, ordinals, and vulgar fractions
- Internet Explorer support deprecated
Acknowledgements
- NumFocus for a "Small Development Grant" that financed a one week sprint to make the initial conversion to TypeScript possible
- TextHelp for their support on
  - refactoring the rule engine, redesigning the rule format and CrowdIn integration
  - localisations into Nordic languages
- Statistical Institute of Catalonia (Idescat) for providing the Catalan translations
- American Action Fund for supporting the ongoing Nemeth work.
- MathJax for their continuing support of the system.
TypeScript Conversion
- The entire code base has been converted from JavaScript in Google Closure Syntax to TypeScript based on ES6 standard.
- Bundling is done using webpack.
- Code is fully cycle free and easily usable with other bundlers.
- Support for some alternative bundlers has been added (rollup and eslint). For more information see the README.
- Code is formatted with prettier
- Code is linted with eslint
- All code adheres to the latest JSDOC documentation conventions.
Code Structure and Building
- Sources are in the ts directory.
- The src directory and all legacy JavaScript code have been removed.
- Building sre is now done with npx tsc; npx webpack
- The bundled file is in lib/sre.js. It works both in node and in a browser. Simply include the file sre.js in your website in a script tag.
- Consequently the sre_browser.js bundle no longer exists.
- Transpiled JavaScript files are in the js directory, which is created on the fly.
- Script sre4node.js has been removed as SRE libraries can be loaded from the js directory directly.
- The Makefile is now exclusively for building the unicode mapping files, one per locale.
- The mathmaps subdirectory with locale sources has been moved to the top level.
npm Package
Structure of the package remains nearly unchanged with two exceptions:
- lib/sre_browser.js is no longer available.
- The js directory with JavaScript files is contained in the distribution for easier integration into third party projects.
Mathmaps Directory
- Unicode mappings are again in files with a .json extension.
- Likewise compiled locale mappings are in a single .json file in lib/mathmaps.
- Use make all to create the mathmaps.
- JSON minimization is done via an intermediate step to generate .min files, which is handled in the Makefile and ensures that only newly altered files have to be minimized.
- Each locale now contains a messages subdirectory. It contains messages used for generating alphabets, font names, etc. Note that in the combined, minified version of the locale .json, messages always need to come first.
- Each locale now also contains a rules subdirectory. It contains the speech rule sets.
Locales and Rule Handling
New Locales
- Norwegian (Nynorsk and Bokmal) support for all rule sets
- Swedish support for all rule sets. Still experimental.
- Catalan support for all rule sets except Clearspeak
2D Nemeth output
- Nemeth support for 2D layout based on the existing Nemeth rule set, handled via a new layout renderer.
- Add to setupEngine: markup: 'layout'
- Or run on the command line with ./bin/sre -b braille -c nemeth -k layout
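As a rough illustration, the command line call above corresponds to a feature vector along the following lines. This is a sketch only: it assumes that the flags -b, -c and -k map to the modality, locale and markup features respectively, and the object is a plain record that is not passed to the real engine here.

```typescript
// Hypothetical stand-in for the feature vector handed to setupEngine.
// Assumption: -b sets the modality, -c the locale and -k the markup.
type Features = Record<string, string>;

const nemethLayout: Features = {
  modality: 'braille', // -b braille
  locale: 'nemeth', // -c nemeth
  markup: 'layout' // -k layout, selects the new layout renderer
};

// With the real library this object would be the argument of setupEngine.
```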
New Speech Rule Handling
- Introduces inheritance of speech rules from an abstract base locale.
- Speech rules are separated into dedicated precondition and action files. A speech rule is only formed when an action is given for an existing precondition.
- The idea is that effectively only actions have to be localised. While new rules or preconditions can still be added, the majority can be inherited from the common base locale.
- Speech rule sets are minimised as much as possible. However, locales can alter rules by adding new preconditions, ignoring existing ones, or overwriting base actions with localised ones.
- Reduction of size for locale files.
- Smaller memory footprint for indexed rules.
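The inheritance scheme described in this list can be pictured roughly as follows. This is a minimal sketch, not the engine's actual code: the types, the buildRules function and the constraint and action strings are all hypothetical placeholders, and the real rule files use a richer format.

```typescript
// Hypothetical types: rule name -> constraint, and rule name -> action.
type Preconditions = Record<string, string>;
type Actions = Record<string, string>;

interface SpeechRule {
  name: string;
  precondition: string;
  action: string;
}

interface LocaleOverrides {
  added?: Preconditions; // new preconditions introduced by the locale
  ignored?: string[]; // names of base preconditions the locale drops
  actions: Actions; // localised actions, overriding base actions
}

function buildRules(
  basePreconditions: Preconditions,
  baseActions: Actions,
  locale: LocaleOverrides
): SpeechRule[] {
  // Start from the inherited preconditions and drop the ignored ones.
  const inherited: Preconditions = { ...basePreconditions };
  for (const name of locale.ignored || []) {
    delete inherited[name];
  }
  // Then add the preconditions that only exist in this locale.
  const preconditions: Preconditions = { ...inherited, ...(locale.added || {}) };
  // Localised actions take priority over base actions.
  const actions: Actions = { ...baseActions, ...locale.actions };
  const rules: SpeechRule[] = [];
  for (const name of Object.keys(preconditions)) {
    const action = actions[name];
    // A rule is only formed if an action exists for the precondition.
    if (action !== undefined) {
      rules.push({ name, precondition: preconditions[name], action });
    }
  }
  return rules;
}

// Placeholder data: three base preconditions, one base action.
const basePreconditions: Preconditions = {
  fraction: 'node is a fraction',
  sqrt: 'node is a square root',
  overscore: 'node is an overscore'
};
const baseActions: Actions = { fraction: 'say "over" between children' };
const localeOverrides: LocaleOverrides = {
  ignored: ['overscore'],
  actions: { sqrt: 'say the localised word for "root"' }
};
const rules = buildRules(basePreconditions, baseActions, localeOverrides);
```

In this sketch the locale only supplies one action and one ignore entry; everything else is inherited, which is the point of keeping locale files small.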
New Localisation Support
- All localisation now in the dedicated repository sre-l10n
- Dedicated YAML format for speech rule actions.
- Support for CrowdIn localisation of symbols, functions and units.
- Automatic update of speech rules from the sre-l10n repository.
- No more changes to locales in this repository
API Changes and Additions
Promise based processing
The functionality for loading locales and updating the engine has been refactored to use ES6 promises. This changes the asynchronous behaviour of the engine, which client code will have to take into account.
Changes to Setup Functions and File Loading
- In particular, the following API methods in system.ts have changed:
  - engineReady() returns a Promise that resolves as soon as the engine is ready for processing.
  - setupEngine() is now an asynchronous function that returns a Promise that resolves as soon as the engine is ready for processing.
  - Other methods that return promises are the file loading methods file.toSpeech, file.toSemantic, ...
- The engine is considered ready for processing when all necessary rule files have been loaded for the current locale and the engine is done updating other internals, like the rule indexing structure, the constraint structures, etc.
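What this means for client code can be sketched with a small stand-in. The EngineLike interface and createMockEngine below are hypothetical and are not the real library; they only mimic the promise-based shape of setupEngine() and engineReady() described above so that the control flow can be run in isolation.

```typescript
// Stand-in for the engine, NOT the real library.
interface EngineLike {
  setupEngine(features: Record<string, string>): Promise<void>;
  engineReady(): Promise<void>;
  toSpeech(expr: string): string;
}

function createMockEngine(): EngineLike {
  let ready: Promise<void> = Promise.resolve();
  let locale = 'en';
  let loaded = false;
  return {
    setupEngine(features) {
      // Simulate asynchronous loading of the rule files for a locale.
      ready = Promise.resolve().then(() => {
        locale = features.locale || locale;
        loaded = true;
      });
      return ready;
    },
    engineReady() {
      // Resolves once the most recent setup has finished.
      return ready;
    },
    toSpeech(expr) {
      if (!loaded) {
        throw new Error('Engine not ready');
      }
      return '[' + locale + '] ' + expr;
    }
  };
}

async function speak(engine: EngineLike, expr: string): Promise<string> {
  // Client code has to wait for the setup promise before processing.
  await engine.setupEngine({ locale: 'de' });
  return engine.toSpeech(expr);
}
```

With the real library the same pattern applies: wait for the promise returned by setupEngine(), or for engineReady(), before requesting output.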
Custom Load methods
- A custom method for loading locales can now be specified.
- The custom load method can be passed to the feature vector in setupEngine.
- In the browser it can be defined in the SREfeature variable.
For more information see the README.
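A custom load method might look roughly like the sketch below. It rests on assumptions that should be checked against the README: that a loader takes a locale name and returns a promise for the JSON string of that locale, and that it is registered under a feature named custom. The in-memory source and the cachedLoader helper are purely hypothetical.

```typescript
// Assumption: a loader maps a locale name to a promise for the JSON
// string of that locale's mappings.
type LocaleLoader = (locale: string) => Promise<string>;

// Hypothetical in-memory source standing in for a CDN or bundled assets.
const bundledLocales: Record<string, string> = {
  en: JSON.stringify({ placeholder: 'en mappings' }),
  de: JSON.stringify({ placeholder: 'de mappings' })
};

let fetchCount = 0;
const fromMemory: LocaleLoader = (locale) => {
  fetchCount++;
  const json = bundledLocales[locale];
  return json === undefined
    ? Promise.reject(new Error('No mappings for locale ' + locale))
    : Promise.resolve(json);
};

// Wraps a loader so that each locale is only fetched once.
function cachedLoader(fetchLocale: LocaleLoader): LocaleLoader {
  const cache: Record<string, Promise<string>> = Object.create(null);
  return (locale) => {
    if (!(locale in cache)) {
      cache[locale] = fetchLocale(locale);
    }
    return cache[locale];
  };
}

const customLoader = cachedLoader(fromMemory);
// Assumed registration: part of the feature vector given to setupEngine,
// for example under a feature named custom.
```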
New API Functions and Features
- Four new API functions for translating numbers, ordinals and vulgar fractions into word representations in the respective locales.
- Sub locales are exposed (e.g., for different readings of numbers in the same language).
- New corresponding options for CLI frontend.
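To make "word representation" concrete, here is a toy English-only sketch for integers from 0 to 99. It is not SRE's implementation and the function names are hypothetical; the engine derives these strings from locale data, which is what allows sub locales to read numbers differently.

```typescript
const ONES = [
  'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
  'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
  'sixteen', 'seventeen', 'eighteen', 'nineteen'
];
const TENS = [
  '', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
  'eighty', 'ninety'
];
const IRREGULAR: Record<string, string> = {
  one: 'first', two: 'second', three: 'third', five: 'fifth',
  eight: 'eighth', nine: 'ninth', twelve: 'twelfth'
};

// Cardinal number words, e.g. 42 as "forty-two".
function toWords(n: number): string {
  if (Math.floor(n) !== n || n < 0 || n > 99) {
    throw new RangeError('only integers from 0 to 99 are supported');
  }
  if (n < 20) return ONES[n];
  const rest = n % 10;
  const tens = TENS[(n - rest) / 10];
  return rest === 0 ? tens : tens + '-' + ONES[rest];
}

// Ordinal words: only the last word of a compound changes.
function toOrdinal(n: number): string {
  const words = toWords(n);
  const cut = words.lastIndexOf('-') + 1;
  const head = words.slice(0, cut);
  const last = words.slice(cut);
  if (IRREGULAR[last] !== undefined) return head + IRREGULAR[last];
  if (last.charAt(last.length - 1) === 'y') {
    return head + last.slice(0, -1) + 'ieth';
  }
  return head + last + 'th';
}

// Vulgar fractions: cardinal numerator followed by ordinal denominator.
function toVulgar(numerator: number, denominator: number): string {
  if (denominator < 2) {
    throw new RangeError('denominator must be at least 2');
  }
  const plural = numerator !== 1;
  const unit =
    denominator === 2
      ? plural ? 'halves' : 'half'
      : toOrdinal(denominator) + (plural ? 's' : '');
  return toWords(numerator) + ' ' + unit;
}
```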
Revamping the Rule Engine
Changes to data structures, speech rules and code structure.
Simple Speech Rules
Simple speech rules for unicode symbols, functions and units are now handled separately from regular speech rules. That is, the data structure MathSimpleStore no longer inherits from MathStore. While this requires some additional logic for parsing, looking up, and selecting simple rules, it reduces the memory footprint of functionality never required by simple rule stores.
Speech Rule Stores
Classes with interface SpeechRuleStore no longer have a trie for indexing speech rules. They are exclusively a container for storing rules together with a common context. Rule lookup can only be done via findRule.
In particular, stores no longer provide a lookupRule method that matches rule applicability with respect to a given DOM node.
Rule lookup is now done on tries only.
Speech Rule Engine
The core engine no longer uses a SpeechRuleStore to look up rules. Previously an active store would have been selected or constructed as a combination of stores, and that store's trie would be used for looking up rules. Now the engine uses a single trie only.
Speech rules are immediately sorted into the trie on load of a locale. While the trie can still be pruned, there is no longer any combining of rule stores (into the active store) or reindexing of tries. The rule engine can therefore no longer be furnished with only a selection of speech rule stores. It will always work with all rules of all locales currently loaded.
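The single-trie design can be illustrated with a much simplified sketch. The RuleTrie class and the constraint path used here (locale, modality, domain, node tag) are assumptions made for the example; the engine's trie distinguishes more kinds of constraints.

```typescript
interface Rule {
  name: string;
  action: string;
}

class TrieNode {
  // Object.create(null) avoids clashes with inherited property names.
  children: Record<string, TrieNode> = Object.create(null);
  rules: Rule[] = [];
}

class RuleTrie {
  private root = new TrieNode();

  // Sort a rule into the trie along its constraint path.
  add(path: string[], rule: Rule): void {
    let node = this.root;
    for (const key of path) {
      if (!node.children[key]) {
        node.children[key] = new TrieNode();
      }
      node = node.children[key];
    }
    node.rules.push(rule);
  }

  // Return the rules stored exactly at the given constraint path.
  lookup(path: string[]): Rule[] {
    let node: TrieNode | undefined = this.root;
    for (const key of path) {
      node = node.children[key];
      if (!node) return [];
    }
    return node.rules;
  }

  // Remove everything below a prefix, e.g. all rules of one locale.
  prune(prefix: string[]): void {
    if (prefix.length === 0) {
      this.root = new TrieNode();
      return;
    }
    let node = this.root;
    for (const key of prefix.slice(0, -1)) {
      if (!node.children[key]) return;
      node = node.children[key];
    }
    delete node.children[prefix[prefix.length - 1]];
  }
}

// Rules of all loaded locales live in the same trie.
const trie = new RuleTrie();
trie.add(['en', 'speech', 'mathspeak', 'fraction'], {
  name: 'fraction',
  action: 'placeholder English action'
});
trie.add(['de', 'speech', 'mathspeak', 'fraction'], {
  name: 'fraction',
  action: 'placeholder German action'
});
```

Because every locale is a branch of the same structure, loading a locale only adds a branch and pruning only removes one; nothing has to be combined or reindexed.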
Locale Messages
Code for locales that was included in the compiled version has been considerably reduced. Only methods for number string generation and alphabet combinations remain. The latter are often shared between multiple locales. Consequently the growth of the size of sre.js should be small when adding new locales.
Messages for locales have been refactored into three categories, included in a new messages subdirectory and in the locale JSON structures:
- alphabets: Strings for Greek and Latin alphabets and corresponding prefixes.
- messages: Messages for MathSpeak, fonts, embellishments, roles, etc.
- numbers: Strings necessary for generating numbers.
Bug fixes
Version 4 contains a number of bug fixes, some introduced during the conversion. However, the issue tracker has been very much neglected during the conversion period. Hopefully this will change in the near future.
Deprecation Notes
- The deprecated -i option has been removed.
- The sre_browser.js library is no longer necessary and no longer created.
- Support for Internet Explorer has been removed. That is, the IE mappings file at the npm repository will no longer be updated. You can still use it, but it will not get any new locale updates and might stop working in some future release.
This will be the last release in this repository. It will move to the Speech-Rule-Engine organisation in the future.