Integrating IKAnalyzer with Lucene: a question about the extension dictionary
Date: 2016-06-11 Source: OSChina (开源中国)
Background
I am using IKAnalyzer 3.2.0 stable and Lucene 3.0.2 (both as jar files).
----------------Contents of IKAnalyzer.cfg.xml----------------------------
<properties>
<entry key="ext_dict">/ext_first.dic</entry>
</properties>
------------------Test code-------------------------------------------------
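import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class TestAnalyzer {

    public static void main(String[] args) throws IOException {
        new TestAnalyzer().test(new IKAnalyzer(), "我是个大帅哥,而且很聪明的大帅哥");
    }

    // Tokenize the text with the given analyzer and print one term per line.
    public void test(Analyzer analyzer, String text) throws IOException {
        // Prints "分词器是:" ("The analyzer is:") followed by the analyzer class name.
        System.out.println("分词器是:" + analyzer.getClass().getName());
        TokenStream tokenStream = analyzer.tokenStream("content", new StringReader(text));
        // Register the term attribute once, before iterating over the tokens.
        TermAttribute termAttribute = tokenStream.addAttribute(TermAttribute.class);
        while (tokenStream.incrementToken()) {
            System.out.println(termAttribute.term());
        }
    }
}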



However, running it fails with the error below. Any suggestions would be appreciated:
分词器是:org.wltea.analyzer.lucene.IKAnalyzer
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.wltea.analyzer.seg.ChineseSegmenter.<init>(ChineseSegmenter.java:37)
at org.wltea.analyzer.cfg.Configuration.loadSegmenter(Configuration.java:114)
at org.wltea.analyzer.IKSegmentation.<init>(IKSegmentation.java:54)
at org.wltea.analyzer.lucene.IKTokenizer.<init>(IKTokenizer.java:44)
at org.wltea.analyzer.lucene.IKAnalyzer.tokenStream(IKAnalyzer.java:45)
at cn.gdpe.lucene.TestAnalyzer.test(TestAnalyzer.java:32)
at cn.gdpe.lucene.TestAnalyzer.main(TestAnalyzer.java:27)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at org.wltea.analyzer.dic.DictSegment.fillSegment(DictSegment.java:139)
at org.wltea.analyzer.dic.DictSegment.fillSegment(DictSegment.java:128)
at org.wltea.analyzer.dic.Dictionary.loadMainDict(Dictionary.java:134)
at org.wltea.analyzer.dic.Dictionary.<init>(Dictionary.java:71)
at org.wltea.analyzer.dic.Dictionary.<clinit>(Dictionary.java:41)
... 7 more


Thanks, everyone.
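
One thing that may be worth ruling out, offered as an assumption rather than a confirmed diagnosis: the trace fails inside dictionary loading (Dictionary.loadMainDict calling DictSegment.fillSegment), and ArrayIndexOutOfBoundsException: 0 means that position 0 of a zero-length array was accessed. The post does not establish that a dictionary file is responsible, but checking ext_first.dic for a byte-order mark or blank lines is cheap. The sketch below is a hypothetical stand-alone helper (DictCheck, hasUtf8Bom and blankLines are made-up names, not part of the IKAnalyzer API) that inspects a file's raw bytes using only the Java standard library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of IKAnalyzer: inspects a dictionary file's raw bytes.
final class DictCheck {

    // True if the data starts with the UTF-8 byte-order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Returns the 1-based numbers of lines that are empty after trimming whitespace.
    // The empty piece that follows a final newline is not counted as a line.
    static List<Integer> blankLines(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        String[] lines = text.split("\\r?\\n", -1);
        int count = lines.length;
        if (count > 0 && lines[count - 1].isEmpty()) {
            count--;
        }
        List<Integer> blank = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (lines[i].trim().isEmpty()) {
                blank.add(i + 1);
            }
        }
        return blank;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DictCheck <dictionary-file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("UTF-8 BOM present: " + hasUtf8Bom(data));
        System.out.println("Blank lines: " + blankLines(data));
    }
}
```

Running it against the dictionary file on disk (the file that the classpath entry /ext_first.dic resolves to) shows whether either condition holds. If both checks come back clean, the cause lies elsewhere and this guess can be discarded.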
