Common Elasticsearch Operations

Reindex

  1. Create a new index, copying the mappings and settings of the original index
  2. Set the new index's refresh_interval to -1 and number_of_replicas to 0 so the reindex runs faster

    PUT /new_twitter/_settings
    {
        "index" : {
            "number_of_replicas" : 0
        }
    }
    
    PUT /new_twitter/_settings
    {
        "index" : {
            "refresh_interval" : "-1"
        }
    }
    
  3. Call the reindex API

    • Leaving out version_type or setting it to internal will cause Elasticsearch to blindly dump documents into the target, overwriting any that happen to have the same type and id:

      POST _reindex
      {
          "source": {
              "index": "twitter"
          },
          "dest": {
              "index": "new_twitter",
              "version_type": "internal"
          }
      }
      
    • Setting version_type to external will cause Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination index than they do in the source index:

      POST _reindex
      {
          "source": {
              "index": "twitter"
          },
          "dest": {
              "index": "new_twitter",
              "version_type": "external"
          }
      }
      
    • Setting op_type to create will cause _reindex to only create missing documents in the target index. All existing documents will cause a version conflict:

      POST _reindex
      {
          "source": {
              "index": "twitter"
          },
          "dest": {
              "index": "new_twitter",
              "op_type": "create"
          }
      }
      
    • By default, version conflicts abort the _reindex process, but you can just count them instead by setting "conflicts": "proceed" in the request body:

      POST _reindex
      {
          "conflicts": "proceed",
          "source": {
              "index": "twitter"
          },
          "dest": {
              "index": "new_twitter",
              "op_type": "create"
          }
      }
      
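    • If a field is mapped as a completion suggester, reindexing (or indexing new data) fails with one of the errors below when that field is empty. A script in the reindex request can fill in a placeholder value: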

      # value must have a length > 0
      # elasticsearch completion field does not support null values
      
      POST _reindex
      {
          "source": {
              "index": "twitter"
          },
          "dest": {
              "index": "new_twitter"
          },
          "script": {
              "source": "if (ctx._source.name == null || ctx._source.name == '') {ctx._source.name = ' '}",
              "lang": "painless"
          }
      }
      
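    • Kibana returns a timeout after 30s on long-running requests. Append ?wait_for_completion=false so the request returns a task ID immediately and the reindex continues in the background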

  4. Restore refresh_interval and number_of_replicas

  5. Wait for the index status to turn green
  6. Add aliases
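
The six steps above can be sketched as an ordered list of requests. This is a minimal Python sketch, not a definitive implementation: the helper name reindex_plan, the alias name, and the restored values (1 replica, "1s" refresh) are assumptions that do not come from these notes. It only builds the method, path, and body of each call; sending them is left to whatever HTTP client you use.

```python
import json


def reindex_plan(source, dest, mappings, settings, alias,
                 replicas=1, refresh_interval="1s"):
    """Return the ordered (method, path, body) requests for a full reindex.

    `replicas` and `refresh_interval` are the values restored in step 4;
    the defaults here are assumptions, use whatever the old index had.
    """
    return [
        # 1. Create the destination index with the copied mappings/settings.
        #    Settings copied from GET /<index> may contain read-only entries
        #    (for example index.uuid) that need removing first.
        ("PUT", f"/{dest}", {"settings": settings, "mappings": mappings}),
        # 2. Disable refresh and replicas while copying.
        ("PUT", f"/{dest}/_settings",
         {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}),
        # 3. Start the copy as a background task to avoid client timeouts.
        ("POST", "/_reindex?wait_for_completion=false",
         {"source": {"index": source}, "dest": {"index": dest}}),
        # 4. Restore the settings (run only after the task from step 3 ends).
        ("PUT", f"/{dest}/_settings",
         {"index": {"refresh_interval": refresh_interval,
                    "number_of_replicas": replicas}}),
        # 5. Wait until the new index reports green.
        ("GET", f"/_cluster/health/{dest}?wait_for_status=green", None),
        # 6. Point the alias at the new index.
        ("POST", "/_aliases",
         {"actions": [{"add": {"index": dest, "alias": alias}}]}),
    ]


if __name__ == "__main__":
    # "twitter_alias" is a hypothetical alias name used for illustration.
    plan = reindex_plan("twitter", "new_twitter",
                        mappings={}, settings={}, alias="twitter_alias")
    for method, path, body in plan:
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Keeping the plan as plain data makes it easy to review or log the requests before anything touches the cluster.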

Reindex task

  • List the running reindex tasks

    GET _tasks?detailed=true&actions=*reindex
    
  • Inspect a single task

    GET /_tasks/taskId:1
    
  • Cancel a task

    POST _tasks/task_id:1/_cancel
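
The task listing comes back grouped by node, which is awkward to scan by eye. The sketch below flattens such a response into one row per task, with a rough progress figure and the path to cancel it. The function name and the SAMPLE payload are hand-written assumptions for illustration; only the nodes/tasks/status nesting follows the shape the tasks API returns for reindex tasks.

```python
def summarize_reindex_tasks(response):
    """Flatten a `GET _tasks?detailed=true&actions=*reindex` response."""
    rows = []
    for node in response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            status = task.get("status", {})
            total = status.get("total", 0)
            # Rough progress: documents written so far over documents found.
            done = sum(status.get(key, 0)
                       for key in ("created", "updated", "deleted"))
            rows.append({
                "task_id": task_id,
                "progress": done / total if total else 0.0,
                "cancel_path": f"/_tasks/{task_id}/_cancel",
            })
    return rows


# Hand-written sample, trimmed to the fields the function reads.
SAMPLE = {
    "nodes": {
        "node-1": {
            "tasks": {
                "node-1:42": {
                    "action": "indices:data/write/reindex",
                    "status": {"total": 200, "created": 50,
                               "updated": 0, "deleted": 0},
                }
            }
        }
    }
}

if __name__ == "__main__":
    for row in summarize_reindex_tasks(SAMPLE):
        print(row)
```

The progress value ignores version conflicts and no-ops, so treat it as an estimate rather than an exact percentage.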
    

null_value

A null value cannot be indexed or searched. The null_value parameter explicitly replaces null with a value that can be.

PUT my_index
{
    "mappings": {
        "_doc": {
          "properties": {
            "status_code": {
              "type":       "keyword",
              "null_value": "NULL"
            }
          }
        }
    }
}
  • The null_value needs to be the same datatype as the field. For instance, a long field cannot have a string null_value.
  • The null_value only influences how data is indexed, it doesn’t modify the _source document.
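
To make the two notes above concrete, here is a small Python model of which values end up indexed for a field. It is a simplified sketch of the documented behaviour, not Elasticsearch code, and the function name indexed_values is an assumption. The point it illustrates: only explicit nulls are replaced, an empty array is left alone, and the original document is never changed.

```python
def indexed_values(value, null_value=None):
    """Return the values that would be indexed for one field.

    Simplified model: explicit nulls become `null_value` when one is
    configured and are dropped otherwise; other values pass through.
    """
    values = value if isinstance(value, list) else [value]
    indexed = []
    for item in values:
        if item is None:
            if null_value is not None:
                indexed.append(null_value)
        else:
            indexed.append(item)
    return indexed


if __name__ == "__main__":
    doc = {"status_code": None}
    print(indexed_values(doc["status_code"], null_value="NULL"))
    print(indexed_values([], null_value="NULL"))
    print(doc)  # the source document itself is untouched
```

With the my_index mapping above, a term query on status_code for "NULL" would therefore match documents whose field was an explicit null, but not documents where the field was an empty array.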

Completion Suggester

  • mapping

    PUT music
    {
        "mappings": {
            "_doc" : {
                "properties" : {
                    "suggest" : {
                        "type" : "completion",
                        "analyzer" : "simple",
                        "search_analyzer" : "simple"
                    },
                    "title" : {
                        "type": "keyword"
                    }
                }
            }
        }   
    }
    
  • query (limit _source to the fields you need, because the size of the returned _source affects query performance)

    POST music/_search?pretty
    {
        "_source": "suggest",
        "suggest": {
            "song-suggest" : {
                "prefix" : "nir",
                "completion" : {
                    "field" : "suggest",
                    "size" : 10
                }
            }
        }
    }
    
  • skip_duplicates: whether duplicate suggestions should be filtered out (defaults to false).

    The autocomplete suggester is document-oriented in Elasticsearch 5 and will thus return suggestions for all documents that match. In Elasticsearch 1 and 2, the autocomplete suggester automatically de-duplicated suggestions.

    POST music/_search?pretty
    {
        "_source": "suggest",
        "suggest": {
            "song-suggest" : {
                "prefix" : "nir",
                "completion" : {
                    "field" : "suggest",
                    "size" : 10,
                    "skip_duplicates" : true
                }
            }
        }
    }
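
The requests in this section can also be assembled and read back programmatically. The following is a minimal Python sketch under stated assumptions: the helper names build_suggest_body and extract_suggestions, and the SAMPLE_RESPONSE payload, are made up for illustration. Note that size and skip_duplicates are placed inside the completion object, which is where the completion suggester reads them.

```python
def build_suggest_body(prefix, field="suggest", name="song-suggest",
                       size=10, skip_duplicates=False):
    """Build a completion-suggester search body for the music index."""
    return {
        "_source": field,  # return only the suggest field to keep hits small
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    "size": size,
                    "skip_duplicates": skip_duplicates,
                },
            }
        },
    }


def extract_suggestions(response, name="song-suggest"):
    """Collect the suggested texts from a suggest response, in order."""
    return [option["text"]
            for entry in response.get("suggest", {}).get(name, [])
            for option in entry.get("options", [])]


# Hand-written sample, trimmed to the fields the function reads.
SAMPLE_RESPONSE = {
    "suggest": {
        "song-suggest": [
            {"text": "nir", "offset": 0, "length": 3,
             "options": [{"text": "Nirvana", "_score": 1.0},
                         {"text": "Nirvana Unplugged", "_score": 1.0}]}
        ]
    }
}

if __name__ == "__main__":
    print(build_suggest_body("nir", skip_duplicates=True))
    print(extract_suggestions(SAMPLE_RESPONSE))
```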
    

DSL

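  • completion suggest sample (elasticsearch-dsl for Python)

    # s is an elasticsearch_dsl Search object; keyword is the prefix to complete
    s = s.suggest('my_suggestion', keyword, completion={'field': 'name', 'size': 100})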