# Fuzzy search: typo-tolerant matching

# What is fuzzy correction

Consider a common scenario: the text typed into a search box may contain a spelling mistake.

doc1: Surprise me
doc2: I wasn't surprised

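Search: surprize (a slip of the finger turned the s into a z). Under normal circumstances this search finds no documents at all, because no indexed term is exactly "surprize".

Fuzzy search compensates for such misspellings automatically: it treats the search text as approximate and tries to match terms in the index that differ from it by only a few characters.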
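
To make "differ by only a few characters" concrete, the sketch below computes the Levenshtein edit distance, the number of single-character insertions, deletions, and substitutions needed to turn one string into another. The function name `edit_distance` is an assumption made for illustration; this is not Elasticsearch's own implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (illustrative helper)."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # One substitution (z -> s) separates the typo from the indexed term.
    print(edit_distance("surprize", "surprise"))  # 1
```

A distance of 1 is why a typo-tolerant query can still reach "surprise" from "surprize".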

# Fuzzy query syntax

Insert test data

POST /my_index/my_type/_bulk
{ "index": { "_id": 1 }}
{ "text": "Surprise me!"}
{ "index": { "_id": 2 }}
{ "text": "That was surprising."}
{ "index": { "_id": 3 }}
{ "text": "I wasn't surprised."}

Use the fuzzy query syntax

GET /my_index/my_type/_search
{
  "query": {
    "fuzzy": {
      "text": {
        "value": "surprize",
        "fuzziness": 2
      }
    }
  }
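}

  • A fuzzy query tolerates typos in the search text: instead of requiring an exact term, it matches indexed terms that are within a small edit distance of the value you supply.
  • fuzziness is the maximum number of single-character edits allowed between the search text and an indexed term. When it is omitted, the default is AUTO, which allows up to 2 edits for a term as long as surprize; here it is set to 2 explicitly.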

Response

"hits": {
  "total": 2,
  "max_score": 0.22585157,
  "hits": [
    {
      "_index": "my_index",
      "_type": "my_type",
      "_id": "1",
      "_score": 0.22585157,
      "_source": {
        "text": "Surprise me!"
      }
    },
    {
      "_index": "my_index",
      "_type": "my_type",
      "_id": "3",
      "_score": 0.1898702,
      "_source": {
        "text": "I wasn't surprised."
      }
    }
  ]
}
}
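
The response contains documents 1 and 3 but not document 2. A rough simulation shows why, assuming the text is split into lowercase terms and each term is compared with the search value by plain Levenshtein distance. The names `DOCS`, `terms`, and `fuzzy_match` are hypothetical, and the real query also scores results, which this sketch ignores.

```python
import re

# Test documents from the bulk request above.
DOCS = {
    1: "Surprise me!",
    2: "That was surprising.",
    3: "I wasn't surprised.",
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def terms(text: str) -> list[str]:
    # Crude stand-in for an analyzer: lowercase, keep letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def fuzzy_match(value: str, max_edits: int) -> dict[int, list[str]]:
    """Return {doc_id: [terms within max_edits of value]} for matching docs."""
    hits = {}
    for doc_id, text in DOCS.items():
        close = [t for t in terms(text) if edit_distance(value, t) <= max_edits]
        if close:
            hits[doc_id] = close
    return hits


if __name__ == "__main__":
    print(fuzzy_match("surprize", 2))
```

Under these assumptions "surprise" is 1 edit away and "surprised" is 2 edits away, while "surprising" is 4 edits away and therefore falls outside a fuzziness of 2.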

There is another way to use it: inside a match query.

# Using fuzziness in a match query

GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "text": {
        "query": "SURPIZE ME",
        "fuzziness": "AUTO",
        "operator": "and"
      }
    }
  }
}

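Setting fuzziness to AUTO lets Elasticsearch choose the number of allowed edits automatically, based on the length of each term in the query.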
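
The length rule behind AUTO can be sketched as follows, assuming the documented default thresholds: terms of 1 or 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. The function name `auto_fuzziness` is made up for this illustration, and the lowercase-and-split step only approximates what an analyzer does to the query text.

```python
def auto_fuzziness(term: str) -> int:
    """Allowed edits for a term under the default AUTO thresholds (sketch)."""
    length = len(term)
    if length <= 2:
        return 0  # very short terms must match exactly
    if length <= 5:
        return 1  # medium terms tolerate one edit
    return 2      # longer terms tolerate two edits


if __name__ == "__main__":
    # Approximate the analyzed terms of the match query text above.
    query_terms = "SURPIZE ME".lower().split()
    print({term: auto_fuzziness(term) for term in query_terms})
```

For the query above this means "surpize" may be off by up to two edits while "me" has to match exactly, and with "operator": "and" both terms must be found in the same document.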