Reindex
- Create a new index and copy the mappings and settings from the original index
Set refresh_interval to -1 and number_of_replicas to 0 on the new index so that the reindex runs faster
PUT /new_twitter/_settings
{ "index" : { "number_of_replicas" : 0 } }
PUT /new_twitter/_settings
{ "index" : { "refresh_interval" : "-1" } }
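The settings changes described here, and their later restoration, can be sketched as plain request bodies. This is a minimal sketch: the helper names are hypothetical, and the restore values of "1s" and 1 replica are assumptions, so substitute whatever the original index actually used.

```python
import json


def bulk_load_settings():
    """Settings body that disables refresh and replicas while reindexing."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval="1s", number_of_replicas=1):
    """Settings body to apply after the reindex has finished.

    The defaults here are assumptions, not values read from the original index.
    """
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": number_of_replicas,
        }
    }


if __name__ == "__main__":
    # Both bodies are sent to PUT /<new index>/_settings.
    print(json.dumps(bulk_load_settings(), indent=2))
    print(json.dumps(restore_settings(), indent=2))
```

Both settings can go in a single request body, which saves one round trip compared with two separate PUT calls.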
Reindex API
Leaving out version_type or setting it to internal will cause Elasticsearch to blindly dump documents into the target, overwriting any that happen to have the same type and id:
POST _reindex
{
  "source": { "index": "twitter" },
  "dest": { "index": "new_twitter", "version_type": "internal" }
}
Setting version_type to external will cause Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination index than they do in the source index:
POST _reindex
{
  "source": { "index": "twitter" },
  "dest": { "index": "new_twitter", "version_type": "external" }
}
Setting op_type to create will cause _reindex to only create missing documents in the target index. All existing documents will cause a version conflict:
POST _reindex
{
  "source": { "index": "twitter" },
  "dest": { "index": "new_twitter", "op_type": "create" }
}
By default, version conflicts abort the _reindex process, but you can just count them instead by setting "conflicts": "proceed" in the request body:
POST _reindex
{
  "conflicts": "proceed",
  "source": { "index": "twitter" },
  "dest": { "index": "new_twitter", "op_type": "create" }
}
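To make the differences between version_type, op_type and conflicts concrete, here is a toy in-memory model of the rules described in these notes. It is only a sketch: simulate_reindex is a hypothetical helper, not part of any Elasticsearch client, and it ignores batching (a real aborted reindex may already have written earlier batches).

```python
def simulate_reindex(source, dest, version_type="internal",
                     op_type="index", conflicts="abort"):
    """Toy model of how _reindex treats documents already in the target.

    source and dest map a document id to a (version, body) tuple.
    Returns (new_dest, conflict_count) and leaves the inputs untouched.
    """
    result = dict(dest)
    conflict_count = 0
    for doc_id, (version, body) in source.items():
        existing = result.get(doc_id)
        if op_type == "create":
            # create: any document that already exists is a conflict
            conflict = existing is not None
        elif version_type == "external":
            # external: only a strictly newer source version may overwrite
            conflict = existing is not None and existing[0] >= version
        else:
            # internal: overwrite blindly
            conflict = False
        if conflict:
            if conflicts != "proceed":
                raise RuntimeError(f"version conflict on document {doc_id!r}")
            conflict_count += 1
            continue
        if version_type == "external":
            new_version = version  # keep the version from the source
        else:
            new_version = existing[0] + 1 if existing else 1
        result[doc_id] = (new_version, body)
    return result, conflict_count


if __name__ == "__main__":
    source = {"1": (3, "new text"), "2": (1, "only in source")}
    dest = {"1": (5, "old text")}
    print(simulate_reindex(source, dest))
    print(simulate_reindex(source, dest, version_type="external", conflicts="proceed"))
    print(simulate_reindex(source, dest, op_type="create", conflicts="proceed"))
```

In this model, document "1" is overwritten under internal versioning, while under external versioning or op_type create it is left alone and counted as a conflict.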
If a field is mapped as a completion suggester, reindexing or indexing a document fails when that field is empty. A script inside the reindex request body can fill in a placeholder value:
# value must have a length > 0
# elasticsearch completion field does not support null values
POST _reindex
{
  "source": { "index": "twitter" },
  "dest": { "index": "new_twitter" },
  "script": {
    "source": "if (ctx._source.name == null || ctx._source.name == '') { ctx._source.tmname = ' ' }",
    "lang": "painless"
  }
}
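The effect of the Painless condition can be mirrored in plain Python to check which documents would be patched. This is a sketch only: fill_empty_name is a hypothetical helper, and the field names name and tmname are copied from the script in these notes.

```python
def fill_empty_name(source):
    """Mirror of the Painless guard: when name is missing, null or empty,
    set tmname to a single space. Returns a patched copy of the document."""
    patched = dict(source)
    name = patched.get("name")
    if name is None or name == "":
        patched["tmname"] = " "
    return patched


if __name__ == "__main__":
    for doc in ({"name": "kimchy"}, {"name": ""}, {"name": None}, {}):
        print(doc, "->", fill_empty_name(doc))
```

Running candidate documents through such a check before the reindex makes it easier to see how many of them would receive the placeholder.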
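Kibana times out after 30s on a long-running reindex because _reindex waits for completion by default. Append ?wait_for_completion=false to the request so that it returns a task ID immediately while the reindex keeps running in the background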
Restore refresh_interval and number_of_replicas
- Wait for the index status to turn green
- Add aliases
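The alias step can be sketched as a request body for POST /_aliases. This is an assumption about how the alias is added, since the notes do not say: the helper name and the alias name twitter_alias are hypothetical, and the optional remove action is only needed if the alias previously pointed at the old index.

```python
import json


def alias_actions(alias, new_index, old_index=None):
    """Build a body for POST /_aliases that points alias at new_index.

    If old_index is given, the alias is removed from it in the same request,
    so the switch happens in one step.
    """
    actions = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    print(json.dumps(alias_actions("twitter_alias", "new_twitter", "twitter"), indent=2))
```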
Reindex task
List the running reindex tasks
GET _tasks?detailed=true&actions=*reindex
Inspect a single task
GET /_tasks/taskId:1
Cancel a task
POST _tasks/task_id:1/_cancel
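When a reindex runs as a background task, its progress can be followed by polling the task until the response reports "completed": true. The loop below is a minimal sketch: wait_for_task and fetch_task are hypothetical names, and fetch_task stands for whatever function performs GET /_tasks/<task id> in your client and returns the parsed JSON.

```python
import time


def wait_for_task(fetch_task, task_id, poll_seconds=5.0, sleep=time.sleep):
    """Poll a task until it reports completion and return the final response.

    fetch_task(task_id) must return the parsed JSON of GET /_tasks/<task_id>.
    sleep is injectable so the loop can be exercised without real delays.
    """
    while True:
        info = fetch_task(task_id)
        if info.get("completed"):
            return info
        sleep(poll_seconds)


if __name__ == "__main__":
    # Fake responses standing in for a real cluster.
    fake = iter([
        {"completed": False},
        {"completed": True, "response": {"total": 3, "created": 3}},
    ])
    final = wait_for_task(lambda task_id: next(fake), "taskId:1", sleep=lambda s: None)
    print(final["response"])
```

A production version would normally add an overall timeout and handle the case where the task ID is no longer known to the cluster.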
null_value
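A null value cannot be indexed or searched. The null_value parameter replaces explicit null values with a specified value so that they can be indexed and searched.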
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "status_code": {
          "type": "keyword",
          "null_value": "NULL"
        }
      }
    }
  }
}
- The null_value needs to be the same datatype as the field. For instance, a long field cannot have a string null_value.
- The null_value only influences how data is indexed, it doesn’t modify the _source document.
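The two points above can be illustrated with a toy model of what ends up in the index for a field that has a null_value. This is a sketch, not Elasticsearch code: indexed_values is a hypothetical helper, and it assumes that only explicit nulls are replaced, so an empty array or a missing field produces nothing to index.

```python
def indexed_values(doc, field, null_value):
    """Toy model: the values indexed for field, with explicit nulls replaced.

    The document itself (the _source) is never modified.
    """
    if field not in doc:
        return []
    raw = doc[field]
    values = raw if isinstance(raw, list) else [raw]
    return [null_value if v is None else v for v in values]


if __name__ == "__main__":
    for doc in ({"status_code": None}, {"status_code": []}, {"status_code": [None, "200"]}):
        print(doc, "->", indexed_values(doc, "status_code", "NULL"))
```

Under this model a term query for "NULL" would match the first and third documents but not the one with the empty array.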
Completion Suggester
Mapping
PUT music
{
  "mappings": {
    "_doc": {
      "properties": {
        "suggest": {
          "type": "completion",
          "analyzer": "simple",
          "search_analyzer": "simple"
        },
        "title": { "type": "keyword" }
      }
    }
  }
}
Query. The size of the returned _source affects query performance, so filter _source down to the fields you need
POST music/_search?pretty
{
  "_source": "suggest",
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": { "field": "suggest", "size": 10 }
    }
  }
}
skip_duplicates: whether duplicate suggestions should be filtered out (defaults to false).
The autocomplete suggester is document-oriented in Elasticsearch 5 and will thus return suggestions for all documents that match. In Elasticsearch 1 and 2, the autocomplete suggester automatically de-duplicated suggestions.
POST music/_search?pretty
{
  "_source": "suggest",
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": { "field": "suggest", "size": 10, "skip_duplicates": true }
    }
  }
}
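The suggest request can also be assembled programmatically. The helper below is a hypothetical sketch that builds the request body as a plain dict; it places size and skip_duplicates inside the completion object, which is where the Elasticsearch documentation shows them, and adds _source filtering only when asked.

```python
import json


def completion_suggest_body(name, prefix, field, size=10,
                            skip_duplicates=False, source_fields=None):
    """Build a _search body containing one completion suggestion."""
    completion = {"field": field, "size": size}
    if skip_duplicates:
        completion["skip_duplicates"] = True
    body = {"suggest": {name: {"prefix": prefix, "completion": completion}}}
    if source_fields is not None:
        # Limit the returned _source to keep responses small.
        body["_source"] = source_fields
    return body


if __name__ == "__main__":
    body = completion_suggest_body(
        "song-suggest", "nir", "suggest",
        skip_duplicates=True, source_fields="suggest",
    )
    print(json.dumps(body, indent=2))
```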
DSL
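Completion suggester example (s is an existing Search object, presumably from elasticsearch-dsl, and keyword is the text typed by the user)
s = s.suggest('my_suggestion', keyword, completion={'field': 'name', 'size': 100})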