ES pinyin analysis: mixed Chinese and pinyin analyzer queries in Elasticsearch + Comple...

Updated: 2023-06-26 15:30:54

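Introduction
Earlier posts covered setting up an Elasticsearch server, creating a simple index, and adding Chinese word segmentation. This post shows how to make Elasticsearch support Chinese segmentation and pinyin segmentation at the same time, and how to build search suggestions similar to those in the Baidu search box.
Mixed queries
There are many ways to implement mixed queries. The one described here is what I consider a lazy approach: for each field you want to search by pinyin, add two extra fields, one holding the full pinyin and one holding the pinyin initials. I use the Employee example from the official documentation: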
import java.io.Serializable;
import java.util.List;

public class Employee implements Serializable {
    private String firstName;
    private String lastName;
    private String pinyin;   // full pinyin of firstName
    private String header;   // pinyin initials of firstName
    private int age;
    private String about;
    private List<String> interests;
    // ... getters and setters omitted
}
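The article fills the two helper fields by hand in its test data. As a minimal sketch of how they relate to each other, assuming the pinyin syllables of the name are already available as a list (for example from a pinyin conversion library, which is not shown here), both values can be derived as follows. The class `PinyinFields` and its method names are hypothetical and not part of the original code.

```java
import java.util.Arrays;
import java.util.List;

/** Derives the two helper fields from pre-split pinyin syllables (hypothetical helper). */
final class PinyinFields {
    private PinyinFields() {}

    /** Joins the syllables into the full-pinyin form stored in the "pinyin" field. */
    static String fullPinyin(List<String> syllables) {
        return String.join("", syllables);
    }

    /** Takes the first letter of each syllable, as stored in the "header" field. */
    static String initials(List<String> syllables) {
        StringBuilder sb = new StringBuilder();
        for (String syllable : syllables) {
            if (!syllable.isEmpty()) {
                sb.append(syllable.charAt(0));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Syllables for the sample first name used in the test data.
        List<String> syllables = Arrays.asList("gao", "bai", "qi", "qiu");
        System.out.println(fullPinyin(syllables)); // prints gaobaiqiqiu
        System.out.println(initials(syllables));   // prints gbqq
    }
}
```

Storing both forms at index time keeps the query side simple: the same user input can be matched against the Chinese field, the full-pinyin field, and the initials field.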
Next, add the settings and the mapping for the index:
// The original listing starts mid-method; the signature and the opening "try" are reconstructed.
public boolean doCreateIndex(String index, String type) {
    try {
        // Index settings: register an analyzer that uses the ik_smart tokenizer.
        XContentBuilder settings = XContentFactory.jsonBuilder();
        settings.startObject()
                .startObject("analysis")
                .startObject("analyzer")
                .startObject("ik_analyzer").field("tokenizer", "ik_smart")
                .endObject()
                .endObject()
                .endObject().endObject();
        CreateIndexRequest createIndexRequest = new CreateIndexRequest(index).settings(settings);
        CreateIndexResponse createIndexResponse = esClient.admin().indices().create(createIndexRequest).get();
        // JSON is presumably fastjson's com.alibaba.fastjson.JSON; the call is garbled in the original.
        logger.info("Index:{} created, response:{}", index, JSON.toJSONString(createIndexResponse));

        // Mapping: Chinese text fields use ik_smart, the two helper fields use the pinyin analyzer.
        XContentBuilder builder = XContentFactory.jsonBuilder();
        builder.startObject()
                .startObject(type)
                .startObject("properties")
                .startObject("firstName").field("type", "string").field("analyzer", "ik_smart")
                /* .field("search_analyzer","ik_smart").field("preserve_separators",false).field("preserve_position_increments",false) */
                .endObject()
                .startObject("lastName").field("type", "string").field("analyzer", "ik_smart")
                .endObject()
                .startObject("pinyin").field("type", "string").field("analyzer", "pinyin")
                .endObject()   // the original opened a new object here instead of closing this one
                .startObject("header").field("type", "string").field("analyzer", "pinyin")
                .endObject()   // missing in the original
                .startObject("about").field("type", "string").field("analyzer", "ik_smart")
                .endObject()
                .startObject("interests").field("type", "string").field("analyzer", "ik_smart")
                .endObject()
                .endObject()
                .endObject()
                .endObject();
        // The mapping type has to be set on the request; the original omitted it.
        PutMappingRequest putMappingRequest = new PutMappingRequest(index).type(type);
        putMappingRequest.source(builder);
        PutMappingResponse putMappingResponse = esClient.admin().indices().putMapping(putMappingRequest).get();
        logger.info("Mapping for `{}.{}` put, response:{}", index, type, JSON.toJSONString(putMappingResponse));
        return true;
    } catch (Exception e) {
        logger.error("doCreateIndex", e);
        return false;
    }
}
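Add a few test documents. Here I use a bulk indexing method directly:
public Boolean bulkIndex(List<String> jsonList) {
    // Check the mapping once per index and cache the result.
    // The condition is garbled in the original; esIndexTypes.get(index) is reconstructed from the put() below.
    if (esIndexTypes.get(index) == null) {
        if (getMapping(index, indexType)) esIndexTypes.put(index, true);
    }
    BulkRequestBuilder bulkBuilder = esClient.prepareBulk();
    for (String s : jsonList) {
        // Each element is one document serialized as a JSON string.
        IndexRequestBuilder requestBuilder = esClient.prepareIndex(index, indexType)
                .setSource(s);
        bulkBuilder.add(requestBuilder);
    }
    BulkResponse bulkResponse = bulkBuilder.execute().actionGet();
    logger.info("index:{} bulk request, response:{}", index, JSON.toJSONString(bulkResponse));
    return true;
}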
@org.junit.Test
public void test() {
    List<String> list1 = new ArrayList<>(10000);
    for (int i = 0; i < 10000; i++) {
        Employee employee = new Employee();
        employee.setFirstName("告白气球" + i);
        employee.setPinyin("gaobaiqiqiu" + i);   // full pinyin of the first name
        employee.setHeader("gbqq");              // pinyin initials of the first name
        employee.setLastName("周杰伦,日记");
        employee.setAbout("呜啦啦啦火车笛\n" +
                "\n" +
                "随着奔腾的马蹄\n" +
                "\n" +
                "小妹妹吹着口琴\n" +
                "\n" +
                "夕阳下美了剪影\n" +
                "\n" +
                "我用子弹写日记,我泡妞看电影");
        employee.setAge(18);
        List<String> list = new ArrayList<>();
        list.add("喜欢打篮球");
        list.add("在大晴天晒太阳");
        list.add("泡妞看电影");
        employee.setInterests(list);
        // Serialize each employee to JSON for the bulk request.
        list1.add(JSON.toJSONString(employee));
    }
    boolean index = esProxy.bulkIndex(list1);
}
Finally, searching for gaobaiqiqiu or gbqq returns data like this:
[{"firstName":"告白气球","lastName":"周杰伦,日记","pinyin":"gaobaiqiqiu","about":"呜啦啦啦火车笛\n\n随着奔腾的马蹄\n\n小妹妹吹着口琴\n\n夕阳下美了剪影\n\n我用子弹写日记,我泡妞看电影","header":"gbqq","interests":["喜欢打篮球","在大晴天晒太阳","泡妞看电影"],"age":18}]
Searching for 告白 directly returns data like this:
[{"firstName":"告白气球","lastName":"周杰伦,日记","pinyin":"gaobaiqiqiu","about":"呜啦啦啦火车笛\n\n随着奔腾的马蹄\n\n小妹妹吹着口琴\n\n夕阳下美了剪影\n\n我用子弹写日记,我泡妞看电影","header":"gbqq","interests":["喜欢打篮球","在大晴天晒太阳","泡妞看电影"],"age":18}]
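The article does not show the search request that produced these results. As a minimal sketch, assuming a `multi_match` query over the three fields `firstName`, `pinyin` and `header` from the mapping, the request body could be built like this. The class `MixedQuery` and its methods are hypothetical names, and only the JDK is used so the snippet runs without an Elasticsearch client.

```java
/** Builds a search body that tries one input against all three fields (hypothetical helper). */
final class MixedQuery {
    private MixedQuery() {}

    /** Escapes backslashes and double quotes; control characters are not handled in this sketch. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Returns a multi_match request body over firstName, pinyin and header. */
    static String body(String text) {
        return "{\"query\":{\"multi_match\":{"
                + "\"query\":\"" + escape(text) + "\","
                + "\"fields\":[\"firstName\",\"pinyin\",\"header\"]}}}";
    }

    public static void main(String[] args) {
        // The same builder serves Chinese input, full pinyin, and initials.
        System.out.println(body("告白"));
        System.out.println(body("gaobaiqiqiu"));
        System.out.println(body("gbqq"));
    }
}
```

With a query of this shape the caller does not need to detect whether the input is Chinese or pinyin, since each field analyzes the input with its own analyzer.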
CompletionSuggestion query suggestions
To use CompletionSuggestion, the mapping has to change: the field that provides real-time suggestions must use the type completion.
XContentBuilder builder = XContentFactory.jsonBuilder();
builder.startObject()
        .startObject(type)
        .startObject("properties")
        // firstName is now a completion field so it can serve suggestions.
        .startObject("firstName").field("type", "completion").field("analyzer", "ik_smart")
        .field("search_analyzer", "ik_smart").field("preserve_separators", false)
        .field("preserve_position_increments", false)
        .endObject()
        .startObject("lastName").field("type", "string").field("analyzer", "ik_smart")
        .endObject()
        .startObject("pinyin").field("type", "string").field("analyzer", "pinyin")
        .endObject()   // the original opened a new object here instead of closing this one
        .startObject("header").field("type", "string").field("analyzer", "pinyin")
        .endObject()   // missing in the original
        .startObject("about").field("type", "string").field("analyzer", "ik_smart")
        .endObject()
        .startObject("interests").field("type", "string").field("analyzer", "ik_smart")
        .endObject()
        .endObject()
        .endObject()
        .endObject();
Querying requires CompletionSuggestionBuilder:
public void searchSuggest(String str) {
    CompletionSuggestionBuilder suggestionBuilder = new CompletionSuggestionBuilder("firstName");
    suggestionBuilder.analyzer("ik_smart");
    // The original call is garbled; prefix(...) is assumed because the REST body uses the "prefix" key.
    suggestionBuilder.prefix(str);
    SearchResponse response = esClient.prepareSearch(index).setTypes(indexType).setQuery(QueryBuilders.matchAllQuery())
            .suggest(new SuggestBuilder().addSuggestion("my-suggest-1", suggestionBuilder)).get();
    Suggest suggest = response.getSuggest();
    CompletionSuggestion suggestion = suggest.getSuggestion("my-suggest-1");
    List<CompletionSuggestion.Entry> list = suggestion.getEntries();
    // Print the score and text of every suggested option.
    for (CompletionSuggestion.Entry entry : list) {
        for (CompletionSuggestion.Entry.Option op : entry.getOptions()) {
            System.out.println(op.getScore() + "--" + op.getText());
        }
    }
}
The corresponding REST request body looks like this:
{ "size": 0,
"suggest": { "my-suggest-1": { "prefix": "someone li", "completion": { "field": "firstName"}}}}
The query returns:
{
"took": 12,
"timed_out": false,
"_shards": { "total": 5, "successful": 5, "failed": 0},
"hits": { "total": 0, "max_score": 0, "hits": []},
"suggest": { "blog-suggest": [ { "text": "someone li", "offset": 0, "length": 10, "options": [ { "text": "someone like you", "_index": "megacorp", "_type": "employee", "_id": "AV_doqcXKY206Vs3lcCO", "_score": 1, "_source": { "about": "呜啦啦啦火车笛\n\n随着奔腾的马蹄\n\n小妹妹吹着口琴\n\n夕阳下美了剪影\n\n我用子弹写日记,我泡妞看电影", "age": 18, "firstName": "someone like you", "interests": [ "喜欢打篮球", "在大晴天晒太阳", "泡妞看电影" ], "lastName": "周杰伦,日记"}} ]} ]}}

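To picture what a prefix suggestion does with the input "someone li", here is a small in-memory sketch. It is only an illustration of prefix lookup over a sorted set of titles and is not how Elasticsearch implements the completion suggester; the class `PrefixSuggester` is a hypothetical name, and analysis (such as ik_smart) is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** In-memory prefix lookup over a sorted set of titles (illustration only). */
final class PrefixSuggester {
    private final TreeSet<String> titles = new TreeSet<>();

    /** Registers one title that can later be suggested. */
    void add(String title) {
        titles.add(title);
    }

    /**
     * Returns all titles that start with the prefix, in sorted order.
     * Titles starting with the prefix sort inside [prefix, prefix + '\uffff'),
     * assuming no title contains the character U+FFFF.
     */
    List<String> suggest(String prefix) {
        return new ArrayList<>(titles.subSet(prefix, prefix + Character.MAX_VALUE));
    }

    public static void main(String[] args) {
        PrefixSuggester suggester = new PrefixSuggester();
        suggester.add("someone like you");
        suggester.add("someone else");
        suggester.add("告白气球");
        System.out.println(suggester.suggest("someone li")); // prints [someone like you]
    }
}
```

The sketch returns only stored titles that begin with the typed text, which mirrors the shape of the response above: one entry for the input and a list of matching options.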
Published: 2023-06-26 15:30:54

Source: https://www.wtabcd.cn/fanwen/fan/82/1044672.html
