
[9] ES bulk data updates and data import/export

Updated: 2023-07-18 08:07:45


This series summarizes the lessons I have learned from using ES in my projects, including hands-on practice and performance tuning.

The approach uses the ES scroll API to back up the data that is about to be updated, and then uses bulk to update that data in batches.
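The scroll-then-bulk flow can be sketched without the Elasticsearch client. What follows is a minimal illustration under stated assumptions, not the author's code: pages stands in for successive scroll responses, flush stands in for executing one bulk request, and the names ScrollBulkSketch and drainInBatches are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class ScrollBulkSketch {

    /**
     * Reads pages from a scroll-like source and hands the collected items to
     * flush in batches of at most batchSize. Returns the number of flushes.
     * Hypothetical helper: pages models scroll responses, flush models a bulk call.
     */
    public static <T> int drainInBatches(Iterator<List<T>> pages, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> buffer = new ArrayList<>(batchSize);
        int flushes = 0;
        while (pages.hasNext()) {
            List<T> page = pages.next();
            if (page.isEmpty()) {
                break; // an empty page marks the end of the scroll
            }
            for (T item : page) {
                buffer.add(item);
                if (buffer.size() == batchSize) {
                    flush.accept(new ArrayList<>(buffer)); // one "bulk request"
                    buffer.clear();
                    flushes++;
                }
            }
        }
        if (!buffer.isEmpty()) {
            flush.accept(new ArrayList<>(buffer)); // send the final partial batch
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<List<Integer>> pages = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6),
                Arrays.asList(7),
                Collections.<Integer>emptyList());
        List<Integer> batchSizes = new ArrayList<>();
        int flushes = drainInBatches(pages.iterator(), 4, batch -> batchSizes.add(batch.size()));
        // prints: 2 flushes, batch sizes [4, 3]
        System.out.println(flushes + " flushes, batch sizes " + batchSizes);
    }
}
```

Flushing the remainder after the loop matters: without it, the last partial batch would never be sent.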

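My own code (based on the official documentation, although the bulk part is written a little differently from the official example):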

/**
 * Bulk-updates Elasticsearch data. Currently it only supports updating an int field,
 * or assigning the value of one field to another field.
 * @param index
 * @param type
 * @param size number of documents written per bulk request
 * @param routing if no routing is set, pass null or ""
 * @param query
 * @param updateField the field to update
 * @param updateValue the new value; either an int or the name of a field in the mapping
 * @return
 */
private String searchElasticsearchDataByScroll(String index, String type, int size, String routing, QueryBuilder query, String updateField, String updateValue) {
    Map<String, Object> hitMap;
    // Open the scroll; the keep-alive passed to TimeValue is in milliseconds.
    SearchResponse response = client.prepareSearch(index)
            .setTypes(type)
            .setQuery(query)
            .setScroll(new TimeValue(6000))
            .setSize(size).execute().actionGet();
    if (size < 500) {
        log.warn("bulk size is small,you'd better set size near 1000 ,size:" + size);
    }
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    do {
        for (SearchHit searchHit : response.getHits().getHits()) {
            hitMap = searchHit.getSource();
            // Stop if the document does not contain the field to update.
            if (!hitMap.containsKey(updateField)) {
                log.error("updateField is not exist,updateField:" + updateField);
                break;
            }
            // ... (the rest of the listing is missing from the source)
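The Javadoc says the update value can be either an int or the name of another field in the mapping. How the original method tells the two apart is not shown, so the following is only a sketch under one assumption: try to parse updateValue as an int first, and otherwise treat it as a field name in the hit's source. The names UpdateValueSketch, resolve, and partialDoc are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class UpdateValueSketch {

    /**
     * Assumed rule: an updateValue that parses as an int is used literally;
     * otherwise it is read as the name of a field in the source document.
     */
    public static Object resolve(Map<String, Object> source, String updateValue) {
        try {
            return Integer.parseInt(updateValue.trim());
        } catch (NumberFormatException notAnInt) {
            if (source.containsKey(updateValue)) {
                return source.get(updateValue);
            }
            throw new IllegalArgumentException("neither an int nor an existing field: " + updateValue);
        }
    }

    /** Builds the partial document that an update request would carry. */
    public static Map<String, Object> partialDoc(Map<String, Object> source, String updateField, String updateValue) {
        if (!source.containsKey(updateField)) {
            throw new IllegalArgumentException("updateField does not exist: " + updateField);
        }
        Map<String, Object> doc = new HashMap<>();
        doc.put(updateField, resolve(source, updateValue));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("views", 3);
        source.put("likes", 10);
        System.out.println(partialDoc(source, "views", "42"));    // prints {views=42}
        System.out.println(partialDoc(source, "views", "likes")); // prints {views=10}
    }
}
```

A field whose name happens to be numeric would be read as a literal under this rule, which is one reason to treat the rule as an assumption rather than the original behavior.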

Some notes on using scroll:

From the official documentation:

A scroll sorted by doc ID (_doc) performs better.

Scroll requests have optimizations that make them faster when the sort order is _doc. If you want to iterate over all documents regardless of the order, this is the most efficient option:

Sliced Scroll:

For scroll queries that return a lot of documents it is possible to split the scroll in multiple slices which can be consumed independently.

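The number of slices should not exceed the number of shards (the official documentation warns that with more slices than shards the slice filter is very slow on the first calls). The official documentation also describes the default slicing algorithm, which hashes the _uid field. You can choose a different field to slice on, but that field has to meet certain requirements. Using sliced scroll can speed up scroll processing.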
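The official documentation gives the default slice assignment as slice(doc) = floorMod(hashCode(doc._uid), max). A small sketch of that formula in plain Java is shown here. It is illustrative only: Java's String.hashCode is used as a stand-in for the hash Elasticsearch applies internally, so the slice numbers will not match a real cluster, and the names SliceSketch, sliceOf, and partition are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SliceSketch {

    /** slice(doc) = floorMod(hashCode(uid), max); floorMod keeps the result in [0, max). */
    public static int sliceOf(String uid, int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        return Math.floorMod(uid.hashCode(), max);
    }

    /** Groups ids by slice so that each slice could be consumed by its own worker. */
    public static List<List<String>> partition(List<String> uids, int max) {
        List<List<String>> slices = new ArrayList<>();
        for (int i = 0; i < max; i++) {
            slices.add(new ArrayList<>());
        }
        for (String uid : uids) {
            slices.get(sliceOf(uid, max)).add(uid);
        }
        return slices;
    }

    public static void main(String[] args) {
        List<String> uids = Arrays.asList("tweet#1", "tweet#2", "tweet#3", "tweet#4", "tweet#5");
        List<List<String>> slices = partition(uids, 2);
        for (int i = 0; i < slices.size(); i++) {
            System.out.println("slice " + i + ": " + slices.get(i));
        }
    }
}
```

Math.floorMod is used instead of the % operator because hash codes can be negative, and % would then yield a negative slice number. Every document lands in exactly one slice, which is what allows the slices to be scrolled independently.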
