Elasticsearch v7.6
In the previous post we ran a match_all query and sorted the results by age. In this post we will build on that query and look at other options.
GET profile/_search
{
  "query": {
    "match": {
      "title": "Mr. Ms."
    }
  },
  "sort": [
    { "age": "asc" }
  ],
  "size": 3
}
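The command above uses match, which searches for specific terms within a field, here title. The query text is analyzed into terms, and by default a document matches if it contains any one of them, so we are looking for titles that are either Mr. or Ms.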
The response received is shown below.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 10, "relation" : "eq" },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "4", "_score" : null,
        "_source" : { "name" : "Deepa G", "age" : 22, "title" : "Ms.", "role" : "QA", "org" : "Security" },
        "sort" : [ 22 ]
      },
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "6", "_score" : null,
        "_source" : { "name" : "Smdie G", "age" : 24, "title" : "Mr.", "role" : "Program management", "org" : "Security" },
        "sort" : [ 24 ]
      },
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "7", "_score" : null,
        "_source" : { "name" : "Amdie G", "age" : 24, "title" : "Mr.", "role" : "Program management", "org" : "Security" },
        "sort" : [ 24 ]
      }
    ]
  }
}
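To make the "any of the terms" behaviour concrete, here is a minimal Python sketch. It is not how Elasticsearch is implemented: the analyze function is a hypothetical, simplified stand-in for the default analyzer (lowercase, split on punctuation and spaces), and the three titles are the values that appear in this index.

```python
import re

def analyze(text):
    """Simplified stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match(field_value, query_text):
    """True if the field shares at least one term with the query (OR semantics)."""
    return bool(set(analyze(field_value)) & set(analyze(query_text)))

titles = ["Mr.", "Ms.", "Mrs."]
matched = [t for t in titles if match(t, "Mr. Ms.")]
print(analyze("Mr. Ms."))  # ['mr', 'ms']
print(matched)             # ['Mr.', 'Ms.']
```

Under these assumptions Mrs. is not matched, because mrs is a different term from both mr and ms.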
match_phrase
GET profile/_search
{
  "query": {
    "match_phrase": {
      "name": "G"
    }
  },
  "sort": [
    { "age": "asc" }
  ],
  "size": 2
}
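In the example above I am searching the name field for the phrase G rather than for individual terms. match_phrase requires the query terms to appear in the field in the same order and next to each other; with a single term such as G it matches the same documents as match would.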
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 8, "relation" : "eq" },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "4", "_score" : null,
        "_source" : { "name" : "Deepa G", "age" : 22, "title" : "Ms.", "role" : "QA", "org" : "Security" },
        "sort" : [ 22 ]
      },
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "5", "_score" : null,
        "_source" : { "name" : "Reepa G", "age" : 24, "title" : "Mrs.", "role" : "QA", "org" : "Security" },
        "sort" : [ 24 ]
      }
    ]
  }
}
Because size is set to 2, only two hits are returned even though 8 documents matched (see hits.total.value).
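The difference between a phrase and individual terms, and the effect of size, can be sketched in a few lines of Python. This is only an illustration under assumptions: analyze is a hypothetical simplification of the default analyzer, and the three names are a small sample taken from the responses in this post.

```python
import re

def analyze(text):
    """Simplified stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match_phrase(field_value, phrase):
    """True if the phrase terms occur consecutively and in order in the field."""
    field, terms = analyze(field_value), analyze(phrase)
    n = len(terms)
    return n > 0 and any(field[i:i + n] == terms for i in range(len(field) - n + 1))

names = ["Deepa G", "Reepa G", "Veronica G"]
matching = [name for name in names if match_phrase(name, "G")]
page = matching[:2]  # same effect as "size": 2

print(len(matching), page)                  # 3 ['Deepa G', 'Reepa G']
print(match_phrase("Deepa G", "Deepa G"))   # True
print(match_phrase("Deepa G", "G Deepa"))   # False: order matters for a phrase
```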
Complex Query
Creating a complex query is equally intuitive.
Bool Query
Let’s look for all the users aged between 30 and 50, both inclusive (age >= 30 && age <= 50).
GET profile/_search
{
  "query": {
    "bool": {
      "must": [
        { "range": { "age": { "gte": 30, "lte": 50 } } }
      ]
    }
  },
  "sort": [
    { "age": "asc" }
  ],
  "size": 2
}
The response, as expected, is shown below. Three documents matched, and the first two by age are returned.
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 3, "relation" : "eq" },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "11", "_score" : null,
        "_source" : { "name" : "Veronica G", "age" : 37, "title" : "Ms.", "role" : "Engineering", "org" : "Security" },
        "sort" : [ 37 ]
      },
      {
        "_index" : "profile", "_type" : "_doc", "_id" : "10", "_score" : null,
        "_source" : { "name" : "Pranav G", "age" : 47, "title" : "Mr.", "role" : "Engineering", "org" : "Security" },
        "sort" : [ 47 ]
      }
    ]
  }
}
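As a rough mental model, the range clause, the sort and the size can be imitated locally in Python. This is a sketch only: search_by_age is a hypothetical helper, and docs contains just the six documents visible in the responses of this post, so its total is smaller than the 3 reported by the real index.

```python
docs = [
    {"name": "Deepa G", "age": 22},
    {"name": "Smdie G", "age": 24},
    {"name": "Amdie G", "age": 24},
    {"name": "Reepa G", "age": 24},
    {"name": "Veronica G", "age": 37},
    {"name": "Pranav G", "age": 47},
]

def search_by_age(docs, gte, lte, size):
    """Keep docs with gte <= age <= lte, sort by age ascending, return the first `size`."""
    in_range = sorted(
        (d for d in docs if gte <= d["age"] <= lte),
        key=lambda d: d["age"],
    )
    return {"total": len(in_range), "hits": in_range[:size]}

result = search_by_age(docs, gte=30, lte=50, size=2)
print([d["name"] for d in result["hits"]])  # ['Veronica G', 'Pranav G']
```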
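A simple modification excludes Veronica. The must_not clause behaves like a filter: documents that match it are dropped, and it does not contribute to the score. Note that the clause uses match_phrase; with the default analyzer a plain match on "Veronica G" is split into the terms veronica and g, and would also exclude every other name containing G.
GET profile/_search
{
  "query": {
    "bool": {
      "must": [
        { "range": { "age": { "gte": 30, "lte": 50 } } }
      ],
      "must_not": [
        { "match_phrase": { "name": "Veronica G" } }
      ]
    }
  },
  "sort": [
    { "age": "asc" }
  ],
  "size": 2
}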
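One thing worth checking with must_not is how much the inner query really matches. The Python sketch below compares excluding by individual terms with excluding by phrase on the two names returned by the range query. It is an assumption-laden illustration: analyze, match and match_phrase are hypothetical simplifications of Elasticsearch's default behaviour, not its implementation.

```python
import re

def analyze(text):
    """Simplified stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match(value, query):
    """OR semantics: any shared term is a match."""
    return bool(set(analyze(value)) & set(analyze(query)))

def match_phrase(value, phrase):
    """All phrase terms must appear consecutively and in order."""
    field, terms = analyze(value), analyze(phrase)
    n = len(terms)
    return n > 0 and any(field[i:i + n] == terms for i in range(len(field) - n + 1))

in_range = ["Veronica G", "Pranav G"]
kept_with_match = [name for name in in_range if not match(name, "Veronica G")]
kept_with_phrase = [name for name in in_range if not match_phrase(name, "Veronica G")]

print(kept_with_match)   # []  (the shared term "g" excludes Pranav G as well)
print(kept_with_phrase)  # ['Pranav G']
```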
Aggregating
Let’s group the documents by title using a terms aggregation.
GET profile/_search
{
  "aggs": {
    "Group-By-Age": {
      "terms": {
        "field": "title.keyword"
      }
    }
  },
  "size": 0
}
The results are:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 11, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "Group-By-Age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        { "key" : "Mr.", "doc_count" : 8 },
        { "key" : "Ms.", "doc_count" : 2 },
        { "key" : "Mrs.", "doc_count" : 1 }
      ]
    }
  }
}
What are these buckets?
"buckets" : []
In our example we are aggregating on title.keyword. Each bucket's key is a unique value found in that field, and doc_count is the number of documents that have that value. So our response says there are 8 Mr., 2 Ms. and 1 Mrs. Note that Group-By-Age is only the name given to the aggregation; the grouping itself is by title.
If size is not zero in the query, the documents that matched are also returned in hits[].
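Conceptually, a terms aggregation is a count per distinct value. The Python sketch below reproduces the buckets above from a list of titles. It is a simplification under assumptions: terms_aggregation is a hypothetical helper, the list is rebuilt from the doc_count values in the response, and ordering is by count descending with ties broken by key.

```python
from collections import Counter

def terms_aggregation(values):
    """Count each distinct value and return buckets, largest count first."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]

# Titles rebuilt from the counts in the response: 8 Mr., 2 Ms., 1 Mrs.
titles = ["Mr."] * 8 + ["Ms."] * 2 + ["Mrs."]
buckets = terms_aggregation(titles)
print(buckets)
```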
Complexity increased
A small modification nests a second terms aggregation inside the first, so that each title bucket is further grouped by role.
GET profile/_search
{
  "aggs": {
    "Group-By-Age": {
      "terms": {
        "field": "title.keyword"
      },
      "aggs": {
        "Group-By-Role": {
          "terms": {
            "field": "role.keyword"
          }
        }
      }
    }
  },
  "size": 0
}
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 11, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "Group-By-Age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Mr.",
          "doc_count" : 8,
          "Group-By-Role" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              { "key" : "Program management", "doc_count" : 2 },
              { "key" : "Engineering", "doc_count" : 1 },
              { "key" : "Engineering management", "doc_count" : 1 },
              { "key" : "Lead", "doc_count" : 1 },
              { "key" : "Lead Engr", "doc_count" : 1 },
              { "key" : "Manager", "doc_count" : 1 },
              { "key" : "Product management", "doc_count" : 1 }
            ]
          }
        },
        {
          "key" : "Ms.",
          "doc_count" : 2,
          "Group-By-Role" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              { "key" : "Engineering", "doc_count" : 1 },
              { "key" : "QA", "doc_count" : 1 }
            ]
          }
        },
        {
          "key" : "Mrs.",
          "doc_count" : 1,
          "Group-By-Role" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              { "key" : "QA", "doc_count" : 1 }
            ]
          }
        }
      ]
    }
  }
}
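The nested result can be thought of as a group-by inside a group-by. The Python sketch below rebuilds the same shape locally. It rests on assumptions: nested_terms and terms_buckets are hypothetical helpers, the (title, role) pairs are reconstructed from the doc_count values in the response above, and buckets are ordered by count descending with ties broken by key.

```python
from collections import Counter, defaultdict

def terms_buckets(values):
    """Count distinct values; largest count first, ties broken by key."""
    counts = Counter(values)
    return [{"key": k, "doc_count": c}
            for k, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def nested_terms(docs, outer, inner):
    """Group docs by `outer`, then count `inner` values within each group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[outer]].append(doc[inner])
    buckets = [
        {"key": key, "doc_count": len(vals), "Group-By-Role": terms_buckets(vals)}
        for key, vals in groups.items()
    ]
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))

# (title, role) pairs rebuilt from the counts in the response (11 documents).
pairs = (
    [("Mr.", "Program management")] * 2
    + [("Mr.", role) for role in ["Engineering", "Engineering management", "Lead",
                                  "Lead Engr", "Manager", "Product management"]]
    + [("Ms.", "Engineering"), ("Ms.", "QA"), ("Mrs.", "QA")]
)
docs = [{"title": title, "role": role} for title, role in pairs]

result = nested_terms(docs, "title", "role")
for bucket in result:
    print(bucket["key"], bucket["doc_count"], bucket["Group-By-Role"])
```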