The word2vec model is 1 GB; no matter how much memory I give it, it is extremely slow and unusable. #1304

Closed
yangnianen opened this issue Oct 17, 2019 · 1 comment

Comments

@yangnianen

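My word2vec model is 1 GB, and the docVectorModel.nearest method is far too slow. How can I speed up prediction with the trained model? A query currently takes more than 20 seconds, which is painful; it is simply unusable.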

@hankcs
Owner

hankcs commented Oct 20, 2019

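There are two efficiency bottlenecks:

  1. Word lookup. You can pass in a Map implementation that you consider fast:
    * @param storage an empty Map (HashMap, etc.)
  2. Vector dot products. You can try replacing
    private List<Map.Entry<K, Float>> nearest(K key, Vector vector, int size)
    with an implementation that computes the dot products on multiple threads.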
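
As a rough illustration of the second suggestion, here is a minimal, library-independent sketch of a multithreaded nearest-neighbour search. It is not the library's code: the class name `ParallelNearest`, the `Map<String, float[]>` storage layout, and the `dot` helper are all assumptions made for this example, and the real method takes the library's own `Vector` type. It assumes the vectors are already normalized, so that the dot product can serve as the similarity score.

```java
import java.util.AbstractMap;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-alone class; it does not extend or call any library type.
public class ParallelNearest {

    /** Plain dot product; equals cosine similarity when both vectors have unit length. */
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Scores every stored vector against the query on the common fork-join pool
     * and returns the `size` highest-scoring entries, best first.
     * Assumption: storage maps each word to a plain float[] of the same length as query.
     */
    static List<Map.Entry<String, Float>> nearest(Map<String, float[]> storage,
                                                  float[] query,
                                                  int size) {
        return storage.entrySet()
                .parallelStream()
                // Compute one dot product per entry; this step runs on multiple threads.
                .<Map.Entry<String, Float>>map(e ->
                        new AbstractMap.SimpleImmutableEntry<>(e.getKey(), dot(e.getValue(), query)))
                // Order by score, highest first, then keep only the top `size` entries.
                .sorted(Map.Entry.<String, Float>comparingByValue(Comparator.reverseOrder()))
                .limit(size)
                .collect(Collectors.toList());
    }
}
```

The storage map is only read here, never modified, so an ordinary `HashMap` can be shared across threads. This sketch sorts every score; for a very large vocabulary a bounded priority queue per thread would avoid the full sort, at the cost of more code.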
