ArrayIndexOutOfBoundsException on reload #1100

Closed
1 task done
cy576013581 opened this issue Feb 21, 2019 · 1 comment

cy576013581 commented Feb 21, 2019

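Notes

Please confirm the following:

  • I have carefully read the documents below and did not find an answer:
  • I have searched for my problem with Google and the issue tracker's search and did not find an answer.
  • I understand that the open-source community is a free community brought together by shared interest and bears no responsibility or obligation. I will be polite and thank everyone who helps me.
  • I have entered x in these brackets to tick the box, confirming the items above.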

Version

The current latest version is: 1.7.1
The version I am using is: 1.7.1

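My question

Calling reload on the dictionary throws an array index out-of-bounds exception. Version 1.7.0 does not have this problem.
Could this be related to the 1.7.1 change for issue #1028 (a custom dictionary bin file generated by reload throws ArrayIndexOutOfBoundsException when it is loaded again)?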

Reproducing the problem

Steps

The exception occurs at dat.build(map); in the loadMainDictionary method of the CustomDictionary class.
Tracing it led to this method in DoubleArrayTrie:

   private int resize(int newSize)
    {
        int[] base2 = new int[newSize];
        int[] check2 = new int[newSize];
        if (allocSize > 0)
        {
            System.arraycopy(base, 0, base2, 0, allocSize);
            System.arraycopy(check, 0, check2, 0, allocSize);
        }

        base = base2;
        check = check2;

        return allocSize = newSize;
    }

In System.arraycopy(base, 0, base2, 0, allocSize);, the source array base has length 1426181,
while base2 has length 2097152 and the copy length allocSize is also 2097152, so the copy reads past the end of base.
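
To illustrate the mismatch described above, here is a minimal, self-contained sketch. It is not HanLP code and not the fix from the referenced commit: ResizeSketch, unsafeResize, guardedResize and safeCopyLength are hypothetical names, and the arrays are tiny stand-ins for the 1426181-element source and the 2097152 copy length. It assumes the failure comes from allocSize being larger than the actual length of base when resize runs.

```java
import java.util.Arrays;

public class ResizeSketch {
    /** Largest element count that can be copied without overrunning either array. */
    static int safeCopyLength(int[] src, int allocSize, int newSize) {
        return Math.max(0, Math.min(Math.min(allocSize, src.length), newSize));
    }

    /** Mirrors the reported pattern: trusts allocSize as the copy length. */
    static int[] unsafeResize(int[] src, int allocSize, int newSize) {
        int[] dst = new int[newSize];
        if (allocSize > 0) {
            // Throws when allocSize > src.length, as in the report.
            System.arraycopy(src, 0, dst, 0, allocSize);
        }
        return dst;
    }

    /** Guarded variant (an assumption, not the upstream fix): clamps the copy length. */
    static int[] guardedResize(int[] src, int allocSize, int newSize) {
        int[] dst = new int[newSize];
        int n = safeCopyLength(src, allocSize, newSize);
        if (n > 0) {
            System.arraycopy(src, 0, dst, 0, n);
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] base = {1, 2, 3, 4, 5}; // shorter than the recorded allocSize
        int staleAllocSize = 8;       // plays the role of 2097152 in the report

        try {
            unsafeResize(base, staleAllocSize, 8);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unsafe copy failed: " + e.getMessage());
        }

        // The guarded copy keeps the five real elements and zero-fills the rest.
        System.out.println(Arrays.toString(guardedResize(base, staleAllocSize, 8)));
    }
}
```

Clamping the copy length only hides the symptom. If allocSize is stale because the arrays were trimmed or replaced by an earlier build, resetting that state before rebuilding would be the more direct remedy, but which approach the maintainer took is not stated in this thread.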

Trigger code

   HanLP.segment("自然语言处理");
   CustomDictionary.reload();

Expected output

The dictionary reloads normally.

Actual output

警告: 自定义词典./work/data/dictionary/custom/CustomDictionary.txt缓存失败!
java.lang.ArrayIndexOutOfBoundsException: arraycopy: last source index 2097152 out of bounds for int[1426181]
	at java.base/java.lang.System.arraycopy(Native Method)
	at com.hankcs.hanlp.collection.trie.DoubleArrayTrie.resize(DoubleArrayTrie.java:94)
	at com.hankcs.hanlp.collection.trie.DoubleArrayTrie.build(DoubleArrayTrie.java:403)
	at com.hankcs.hanlp.collection.trie.DoubleArrayTrie.build(DoubleArrayTrie.java:338)
	at com.hankcs.hanlp.collection.trie.DoubleArrayTrie.build(DoubleArrayTrie.java:365)
	at com.hankcs.hanlp.collection.trie.DoubleArrayTrie.build(DoubleArrayTrie.java:378)
	at com.hankcs.hanlp.dictionary.CustomDictionary.loadMainDictionary(CustomDictionary.java:107)
	at com.hankcs.hanlp.dictionary.CustomDictionary.loadMainDictionary(CustomDictionary.java:157)
	at com.hankcs.hanlp.dictionary.CustomDictionary.reload(CustomDictionary.java:657)
	at com.gildata.removal.Test_Hanlp.main(Test_Hanlp.java:21)

Other information

Any guidance would be appreciated.

Owner

hankcs commented Feb 22, 2019

Thanks for the report. This has been fixed; please see the commit above.
If the problem persists, feel free to reopen the issue.
