This article explains how to configure synonyms in Solr 4.7, a point that often causes confusion in day-to-day work.
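Search often needs related terms (synonyms). For example, a search for the brand "聯想" should also match "lenovo". Solr provides a synonym filter for this purpose, solr.SynonymFilterFactory.
Configuring synonyms is simple: add the filter to the analyzer of a field type in the schema.
In schema.xml, add a <fieldType> inside the <types> element, as follows: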
<!-- IK Chinese tokenizer; stop-word and synonym configuration -->
<fieldType name="text_ik" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
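In the solr.SynonymFilterFactory configuration, synonyms names the synonym file (here synonyms.txt, placed in the core's conf directory).
ignoreCase: when true, terms are lowercased before matching, so matching is case-insensitive.
expand: controls how comma-separated groups in synonyms.txt are interpreted. When true, each term in a group is expanded to every term in the group; when false, every term in a group is reduced to the first term on its line. Because the filter above runs only at query time, expand="false" rewrites a query for lenovo to 聯想 (given the line 聯想,lenovo) and searches for that term alone; to have either term match documents containing the other, set expand="true".
synonyms.txt holds one rule per line, each relating a set of terms: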
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#-----------------------------------------------------------------------
#some test synonym mappings unlikely to appear in real input text
aaafoo => aaabar
bbbfoo => bbbfoo bbbbar
cccfoo => cccbar cccbaz
fooaaa,baraaa,bazaaa

# Some synonym groups specific to this example
GB,gib,gigabyte,gigabytes
MB,mib,megabyte,megabytes
Television, Televisions, TV, TVs
#notice we use "gib" instead of "GiB" so any WordDelimiterFilter coming
#after us won't split it into two words.
中國,英國,日本

# Synonym mappings can be used for spelling correction too
pixima => pixma
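In other words, => defines an explicit one-way mapping: a term on the left is replaced by what is on the right. A comma-separated line without => defines a group of equivalent terms: with expand="true" each term matches all the others (many-to-many), and with expand="false" they are all reduced to the first term in the group.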
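To make the two rule forms concrete, here is a minimal Python sketch of how such a file could be read into a lookup table. It is a simplified model for illustration only, not Solr's implementation: parse_synonyms is a hypothetical helper, multi-word entries are kept as plain strings, and escaping is ignored.

```python
def parse_synonyms(text, expand=True, ignore_case=True):
    """Simplified model: map each input term to its list of output terms."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if ignore_case:
            line = line.lower()
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by right-hand terms.
            left, right = line.split("=>", 1)
            inputs = [t.strip() for t in left.split(",") if t.strip()]
            outputs = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalent group: expand decides between "all terms" and "first term".
            group = [t.strip() for t in line.split(",") if t.strip()]
            inputs = group
            outputs = group if expand else group[:1]
        for term in inputs:
            merged = mapping.setdefault(term, [])
            for out in outputs:
                if out not in merged:
                    merged.append(out)  # rules for the same term are merged
    return mapping


rules = """
# equivalent terms
聯想,lenovo
# explicit one-way mapping
pixima => pixma
"""

print(parse_synonyms(rules, expand=True))
print(parse_synonyms(rules, expand=False))
```

With expand=True, both 聯想 and lenovo map to the full group; with expand=False, both map to 聯想 only. The pixima rule maps to pixma in either case, because explicit mappings are not affected by expand.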
The field you want to search must also be declared with this type, for example:
<field name="msg_title" type="text_ik" indexed="true" stored="true"/>
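After the schema change is loaded, one quick check is to query the field and look at how the query text is analyzed. The sketch below only builds the request URLs and does not contact a server. The host, port, and core name (collection1) are assumptions, as is the presence of the /analysis/field handler, which the stock example solrconfig.xml registers but a custom setup may not. Both function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed location of the Solr core; adjust to your installation.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def select_url(field, term, base=SOLR_BASE):
    """URL for a plain search against one field."""
    params = {"q": f"{field}:{term}", "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


def field_analysis_url(field, query_text, base=SOLR_BASE):
    """URL asking the field analysis handler (assumed to be registered)
    to show the tokens produced by the field's query analyzer."""
    params = {
        "analysis.fieldname": field,
        "analysis.query": query_text,
        "wt": "json",
    }
    return f"{base}/analysis/field?{urlencode(params)}"


print(select_url("msg_title", "lenovo"))
print(field_analysis_url("msg_title", "lenovo"))
```

Opening the second URL in a browser should show whether the synonym filter rewrote the query term as expected. The Analysis screen in the Solr admin UI gives the same information interactively.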