# slop in match_phrase

# Syntax

GET /forum/article/_search
{
    "query": {
        "match_phrase": {
            "content": {
                "query": "java spark",
                "slop":  1
            }
        }
    }
}

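# What does slop mean?

slop is the number of times the terms in the query string have to be moved before the phrase matches a document.

For example:

Take this text: hello world, java is very good, spark is also very good.
A match_phrase search for java spark finds nothing, because the two terms are not adjacent in the text.
If we specify slop, the terms of java spark are allowed to move in order to try to match the doc.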

java    is    very    good    spark   is

java    spark
java      --> spark
java              --> spark
java                      -->  spark

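The diagram above fixes the first term and moves the following term one position at a time until the phrase lines up with the document.
The number of moves is the slop: three here, so this document only matches when slop is at least 3.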
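
The counting in the diagram can be sketched in Python. This is a simplified model, not Lucene's actual implementation: the `tokenize` and `min_slop` helpers are hypothetical names, the tokenizer is only a rough stand-in for an analyzer, and repeated query terms are not handled. It assumes each query term is shifted back by its offset in the phrase, and that the required slop is the spread between the largest and smallest shifted positions.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase and split into word tokens (rough stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def min_slop(query, text):
    """Smallest slop under which the phrase would match, or None if a term is absent.

    Simplified model: ignores repeated query terms and analyzer details.
    """
    doc_tokens = tokenize(text)
    query_terms = tokenize(query)

    # Collect every position at which each query term occurs in the document.
    positions = []
    for term in query_terms:
        found = [i for i, tok in enumerate(doc_tokens) if tok == term]
        if not found:
            return None
        positions.append(found)

    best = None
    for combo in product(*positions):
        # Shift each position back by the term's offset inside the phrase;
        # an exact phrase gives identical shifted values (cost 0).
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        cost = max(shifted) - min(shifted)
        if best is None or cost < best:
            best = cost
    return best


text = "hello world, java is very good, spark is also very good."
print(min_slop("java spark", text))          # 3
print(min_slop("spark java", "java spark"))  # 2: swapping two adjacent terms costs two moves
```

Under this model the example sentence needs three moves, which agrees with the diagram.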

TIP

slop is only the maximum number of moves allowed; a document that needs fewer moves still matches.
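
A small sketch of slop as an upper bound. The move counts are worked out by hand from the texts in this article (the example sentence needs 3 moves; in the verification response further down, doc 7 needs 0 and doc 8 needs 1). The `REQUIRED_MOVES` table and `matches` helper are hypothetical names used only for illustration.

```python
# Moves each text needs before "java spark" lines up with it (counted by hand).
REQUIRED_MOVES = {
    "doc 7": 0,             # "java spark are very related, ..."
    "doc 8": 1,             # "java are spark very related, ..."
    "example sentence": 3,  # "... java is very good, spark is ..."
}


def matches(required_moves, slop):
    """Return the texts whose required moves do not exceed the given slop."""
    return [name for name, moves in required_moves.items() if moves <= slop]


print(matches(REQUIRED_MOVES, 0))  # ['doc 7']
print(matches(REQUIRED_MOVES, 2))  # ['doc 7', 'doc 8']
print(matches(REQUIRED_MOVES, 3))  # ['doc 7', 'doc 8', 'example sentence']
```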

# Verifying slop

GET /forum/article/_search
{
    "query": {
        "match_phrase": {
            "content": {
                "query": "java spark",
                "slop":  2
            }
        }
    }
}

Response

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.1324264,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "7",
        "_score": 1.1324264,
        "_source": {
          "content": "java spark are very related, because scala is spark's programming language and scala is also based on jvm like java."
        }
      },
      {
        "_index": "forum",
        "_type": "article",
        "_id": "8",
        "_score": 0.21395226,
        "_source": {
          "content": "java are spark very related, because scala is spark's programming language and scala is also based on jvm like java."
        }
      }
    ]
  }
}

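Try raising the slop value. Several of the documents indexed earlier contain both java and spark, and you will find that the closer together the two terms are (the fewer moves they need), the higher the score.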
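
A rough sketch of why closer terms score higher. It assumes the weighting `1 / (distance + 1)` that Lucene's stock similarities are understood to apply to a sloppy phrase occurrence; treat that formula as an assumption. The `sloppy_weight` helper is a hypothetical name, and the real `_score` also depends on term statistics and field length, so only the trend is meaningful.

```python
def sloppy_weight(distance):
    """Assumed weight of one phrase occurrence that needed `distance` moves."""
    return 1.0 / (distance + 1)


# The weight shrinks as the terms drift further apart.
for distance in range(4):
    print(distance, round(sloppy_weight(distance), 3))
```

An exact phrase (distance 0) keeps the full weight, while a match that needs three moves contributes only a quarter of it.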

In fact, a phrase match with slop is a proximity match.

  • java spark as an exact phrase in the doc: phrase match
  • java spark with some distance allowed between the terms, where closer matches rank higher: proximity match