There is a .trim() call, but it only strips whitespace. If the separator is a non-whitespace character (e.g. underscore, dash, period, or bar), trim() leaves the extra separator in place.
Also, Apache Commons has a StringUtils that works very nicely. Perhaps we want to remove this in favor of that.
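To make the problem concrete, here is a minimal sketch. It assumes the join appends the separator after every element and then calls trim(); the actual code at the referenced line may differ. `JoinSketch`, `naiveJoin`, and `fixedJoin` are hypothetical names used only for illustration, and `fixedJoin` uses the JDK's `String.join` (Java 8+) as a stand-in for the Apache Commons replacement so that the example has no external dependency.

```java
import java.util.Arrays;
import java.util.List;

public class JoinSketch {
    // Hypothetical stand-in for the problematic pattern (an assumption, not
    // the actual cogcomp-nlp code): append the separator after every element,
    // then rely on trim() to clean up the end of the string.
    static String naiveJoin(String sep, List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(sep);
        }
        // trim() only strips leading/trailing whitespace, so a trailing
        // "-" or "_" survives.
        return sb.toString().trim();
    }

    // Places the separator only between elements, so nothing needs trimming.
    static String fixedJoin(String sep, List<String> parts) {
        return String.join(sep, parts);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(naiveJoin(" ", parts)); // a b c
        System.out.println(naiveJoin("-", parts)); // a-b-c-
        System.out.println(fixedJoin("-", parts)); // a-b-c
    }
}
```

With a space separator the two versions agree, which is probably why the bug goes unnoticed until a non-whitespace separator is used.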
Currently, some feature generators in Edison call StringUtils.join() to generate features, which puts an extra copy of the dash in the feature name. Example below.
Switching to Apache Commons will change the feature names and so break models that rely on Edison (PrepSRL, I think). What's the best thing to do here? @mssammon
This probably requires retraining. Create a branch "requires_retrain" and make the change on that branch. Once it is implemented, create a new issue for PrepSRL retraining and refer to it in the branch notes. Presumably it is sufficient to retrain the PrepSRL models and deploy them, then update the dependencies on the branch; at that point, the branch can be merged. (You are not obligated to take the new PrepSRL issue, but you can if you want.)
cogcomp-nlp/core-utilities/src/main/java/edu/illinois/cs/cogcomp/core/utilities/StringUtils.java
Line 60 in ce0f3a0