Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Using jieba to extact input tags,before best_match.
For example, suppose there are 10000+ corpus in mongo, while user_input is "where is the apple?"
Before:
The best_match logic get all the corpus in mongo (distinct by in_reponse.text) , the size is about 10000 .Then we compare all this corpus with input. This takes a lot time.
After:
This pull extracts tags of the input. for this example,tags are maybe "apple". Then we add a addtional search option to mongo, using regex ,to select the corpus only related with "apple". the size is more less than 10000 (maybe only 100). Then we compare this 100 corpus with input. This is more fast than before.
Notice:
1.JiaBa required.
2.Best perfomance with Chinese, compatible with english.
3.The first time we load jieba takes about 2s,for each bot.