AIWordSplit
Code description:
Run compile.bat and then run.bat directly. Three word segmentation methods are implemented:
1. Forward maximum matching (ForwardMatch.java)
2. Backward maximum matching (ForwardMatch.java)
3. Maximum frequency matching (FrequencyMatch.java) (default)
Maximum frequency matching takes the highest-frequency word in the sentence, then recurses on the text to either side of it, building a binary tree that stores the words of the sentence. The result is displayed with an in-order traversal of the tree. Because a single character is very likely to have a higher usage frequency than a whole word, candidates are filtered during selection: a single character that is not at the start of the current sentence is ignored for the time being; if a single character is at the start, the temporary mostFrequency is still 0, and the word length is 1, it is added to the binary tree.
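The maximum frequency method described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of FrequencyMatch.java, which is not shown here: the class name FrequencySplitSketch, the Node type, the MAX_WORD_LEN cap, the build/segment helpers, and the in-memory frequency table are all hypothetical. The toy input is unspaced Latin text so the example does not depend on source file encoding; the same logic applies to Chinese strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrequencySplitSketch {
    /** Tree node: one word, with the segmented left and right remainders as children. */
    static final class Node {
        final String word;
        Node left, right;
        Node(String word) { this.word = word; }
    }

    // Assumed cap on candidate word length (hypothetical; not taken from the original code).
    static final int MAX_WORD_LEN = 4;

    /** Picks the most frequent multi-character word, then recurses on both sides of it. */
    static Node build(String s, Map<String, Integer> freq) {
        if (s.isEmpty()) return null;
        int mostFrequency = 0;
        // Default choice: the leading single character, used only if nothing better is found.
        int bestStart = 0, bestEnd = 1;
        for (int i = 0; i < s.length(); i++) {
            // j starts at i + 2, so single characters are never scored here.
            for (int j = i + 2; j <= Math.min(s.length(), i + MAX_WORD_LEN); j++) {
                int f = freq.getOrDefault(s.substring(i, j), 0);
                if (f > mostFrequency) {
                    mostFrequency = f;
                    bestStart = i;
                    bestEnd = j;
                }
            }
        }
        // If mostFrequency is still 0, the node holds the single character at the start.
        Node node = new Node(s.substring(bestStart, bestEnd));
        node.left = build(s.substring(0, bestStart), freq);
        node.right = build(s.substring(bestEnd), freq);
        return node;
    }

    /** In-order traversal restores the words in their original sentence order. */
    static void inOrder(Node n, List<String> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.word);
        inOrder(n.right, out);
    }

    public static List<String> segment(String s, Map<String, Integer> freq) {
        List<String> out = new ArrayList<>();
        inOrder(build(s, freq), out);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<>(); // toy frequencies, made up for the demo
        freq.put("we", 50);
        freq.put("are", 30);
        freq.put("here", 10);
        System.out.println(segment("weareahere", freq)); // prints [we, are, a, here]
    }
}
```

In this sketch the single "a" is emitted only after "here" has been chosen and "a" is all that remains of the left-hand fragment, which mirrors the rule that a one-character word is accepted only at the start of the current fragment while mostFrequency is still 0.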