Welcome to the Beginner's Crash Course to Elastic Stack!
This repo contains all resources shared during Part 2: Understanding the relevance of your search with Elasticsearch and Kibana.
By the end of this workshop, you will:
- learn how precision and recall are used to measure how well the Elasticsearch search engine is searching
- understand how scoring is used to rank the relevance of your search results in Elasticsearch
- master how to send search queries from Kibana to Elasticsearch to fine-tune the precision or recall of your search results
Beginner's Crash Course to Elastic Stack Table of Contents
This workshop is part of the Beginner's Crash Course to Elastic Stack series. Check out this table of contents to access all the workshops in the series thus far. The table will be updated as more workshops in the series are released!
Instructions on how to access Elasticsearch and Kibana on Elastic Cloud
Instructions for downloading Elasticsearch and Kibana
Video of the workshop
Mini Beginner's Crash Course to Elasticsearch & Kibana playlist
Do you prefer learning by watching shorter videos? Check out this playlist to watch short clips of the full-length Beginner's Crash Course workshops. The Part 2 workshop is broken down into episodes 7-10. Season 2 clips will be uploaded here in the future!
Blog on understanding the relevance of your search with Elasticsearch and Kibana
Elastic America Virtual Chapter
Want to attend live workshops? Join the Elastic America Virtual Chapter to get the deets!
What's next?
Eager to continue your learning after mastering the concepts from this workshop? Move on to Part 3: Running full text queries and combined queries with Elasticsearch and Kibana here!
There are two main ways to search in Elasticsearch:
- Queries
- Aggregations
Queries retrieve documents that match the criteria.
Syntax:
GET enter_name_of_the_index_here/_search
Example:
GET news_headlines/_search
Expected response from Elasticsearch:
By default, Elasticsearch displays the number of hits and a sample of 10 search results.
To improve the response speed on large datasets, Elasticsearch limits the total count to 10,000 by default. If you want the exact total number of hits, use the following query.
Syntax:
GET enter_name_of_the_index_here/_search
{
  "track_total_hits": true
}
Example:
GET news_headlines/_search
{
  "track_total_hits": true
}
Expected response from Elasticsearch:
You will see that the total number of hits is now 200,853.
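When a client reads the response, the `hits.total` object carries both a `value` and a `relation`: `"gte"` means the count is a lower bound, and `"eq"` means it is exact. The following is a minimal Python sketch of that check, assuming the response has already been parsed into a dictionary. The `describe_total` helper and the two sample responses are hypothetical and only mirror the shape of the `hits.total` object.

```python
def describe_total(response: dict) -> str:
    """Summarize hits.total, distinguishing exact counts from lower bounds."""
    total = response["hits"]["total"]
    if total["relation"] == "eq":
        return f"exactly {total['value']} hits"
    # A relation of "gte" means the real count may be higher than the value shown.
    return f"at least {total['value']} hits"


# Hypothetical, trimmed-down responses (real responses contain many more fields).
default_response = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
tracked_response = {"hits": {"total": {"value": 200853, "relation": "eq"}}}

print(describe_total(default_response))
print(describe_total(tracked_response))
```

Checking `relation` before displaying a count avoids presenting the capped value of 10,000 as if it were the true total.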
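To narrow a search, add a query to the request body. For example, a range query retrieves documents whose field value falls within a given range.
Syntax: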
GET enter_name_of_the_index_here/_search
{
  "query": {
    "Specify the type of query here": {
      "Enter name of the field here": {
        "gte": "Enter lowest value of the range here",
        "lte": "Enter highest value of the range here"
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "query": {
    "range": {
      "date": {
        "gte": "2015-06-20",
        "lte": "2015-09-22"
      }
    }
  }
}
Expected response from Elasticsearch:
The query pulls up articles published from June 20, 2015 through September 22, 2015. One document from the result set is shown as an example.
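Both `gte` and `lte` are inclusive bounds, so documents dated exactly on either end of the range are returned. The Python sketch below imitates that comparison on a handful of made-up documents. The `in_range` helper and the sample data are hypothetical, and the sketch assumes day-level dates only.

```python
from datetime import date


def in_range(value: date, gte: date, lte: date) -> bool:
    """Inclusive on both ends, like a range query with gte and lte."""
    return gte <= value <= lte


# Hypothetical documents, chosen to probe both boundaries.
docs = [
    {"headline": "Before the window", "date": date(2015, 6, 19)},
    {"headline": "On the lower bound", "date": date(2015, 6, 20)},
    {"headline": "Inside the window", "date": date(2015, 8, 1)},
    {"headline": "On the upper bound", "date": date(2015, 9, 22)},
    {"headline": "After the window", "date": date(2015, 9, 23)},
]

lower = date.fromisoformat("2015-06-20")
upper = date.fromisoformat("2015-09-22")
matched = [d["headline"] for d in docs if in_range(d["date"], lower, upper)]
print(matched)
```

Swapping `gte` for `gt` (or `lte` for `lt`) in the real query would exclude the matching boundary document.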
An aggregation summarizes your data as metrics, statistics, and other analytics.
Syntax:
GET enter_name_of_the_index_here/_search
{
  "aggs": {
    "name your aggregation here": {
      "specify aggregation type here": {
        "field": "name the field you want to aggregate here",
        "size": state how many buckets you want returned here
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category",
        "size": 100
      }
    }
  }
}
Expected response from Elasticsearch:
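A terms aggregation groups documents into buckets, one per distinct field value, and reports each bucket's `key` and `doc_count`. The Python sketch below reproduces that grouping in memory to show what the buckets represent. The `terms_buckets` helper and the category list are hypothetical, and the sketch makes no attempt to model how Elasticsearch computes the counts across shards.

```python
from collections import Counter


def terms_buckets(values, size):
    """Return up to `size` buckets, most frequent value first."""
    counts = Counter(values)
    return [
        {"key": key, "doc_count": doc_count}
        for key, doc_count in counts.most_common(size)
    ]


# Hypothetical category values, one per document.
categories = [
    "POLITICS", "ENTERTAINMENT", "POLITICS",
    "SPORTS", "POLITICS", "ENTERTAINMENT",
]

buckets = terms_buckets(categories, size=2)
print(buckets)
```

With `size` set to 2, only the two most common categories come back, which is why the example request above asks for 100 buckets to see every category.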
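Queries and aggregations can be combined in a single request. For example, you can find the most significant terms within one category.
Syntax: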
GET enter_name_of_the_index_here/_search
{
  "query": {
    "match": {
      "Enter the name of the field": "Enter the value you are looking for"
    }
  },
  "aggregations": {
    "Name your aggregation here": {
      "significant_text": {
        "field": "Enter the name of the field you are searching for"
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "query": {
    "match": {
      "category": "ENTERTAINMENT"
    }
  },
  "aggregations": {
    "popular_in_entertainment": {
      "significant_text": {
        "field": "headline"
      }
    }
  }
}
Expected response from Elasticsearch:
To search a field for specific terms, use the match query.
Syntax:
GET enter_name_of_index_here/_search
{
  "query": {
    "match": {
      "Specify the field you want to search": {
        "query": "Enter search terms"
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner"
      }
    }
  }
}
Expected response from Elasticsearch:
By default, the match query uses "OR" logic. If a document contains at least one of the search terms, Elasticsearch considers that document a hit.
"OR" logic results in a higher number of hits, thereby increasing recall. However, many of the hits are only loosely related to the query, which lowers precision.
We can increase precision by adding an "and" operator to the query.
Syntax:
GET enter_name_of_index_here/_search
{
  "query": {
    "match": {
      "Specify the field you want to search": {
        "query": "Enter search terms",
        "operator": "and"
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner",
        "operator": "and"
      }
    }
  }
}
Expected response from Elasticsearch:
The "and" operator returns only documents that contain all of the search terms, thereby increasing precision. However, it reduces the number of hits returned, resulting in lower recall.
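The trade-off between the two operators can be made concrete with numbers: precision is the share of returned hits that are relevant, and recall is the share of relevant documents that were returned. The Python sketch below computes both for "or" and "and" matching over a tiny made-up set of headlines. The headlines, the relevance labels, and the `tokenize`, `match`, and `precision_recall` helpers are all hypothetical, and whitespace tokenization is a stand-in for the analysis Elasticsearch actually performs.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a simplification of real text analysis)."""
    return set(text.lower().split())


def match(headlines, query, operator="or"):
    """Return the indexes of headlines that satisfy the query under the operator."""
    terms = tokenize(query)
    hits = set()
    for index, headline in enumerate(headlines):
        found = terms & tokenize(headline)
        if operator == "and" and found == terms:
            hits.add(index)  # every search term is present
        elif operator == "or" and found:
            hits.add(index)  # at least one search term is present
    return hits


def precision_recall(hits, relevant):
    """Precision = relevant hits / all hits; recall = relevant hits / all relevant."""
    true_positives = len(hits & relevant)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical headlines; the indexes in `relevant` are an assumed human judgment.
headlines = [
    "Khloe Kardashian and Kendall Jenner attend gala",
    "Kendall Jenner joins Khloe Kardashian on stage",
    "Khloe Kardashian launches new show",
    "Khloe and Kendall spotted together",
    "Jenner fans gather downtown",
    "Stock markets rally",
]
relevant = {0, 1, 3}
query = "Khloe Kardashian Kendall Jenner"

or_hits = match(headlines, query, "or")
and_hits = match(headlines, query, "and")
print("or: ", precision_recall(or_hits, relevant))
print("and:", precision_recall(and_hits, relevant))
```

In this toy set, "or" finds every relevant headline but also returns unrelated ones, while "and" returns only relevant headlines but misses the one that omits the last names.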
The minimum_should_match parameter allows you to specify the minimum number of search terms a document must contain to be included in the search results.
This parameter gives you more control over fine-tuning the precision and recall of your search.
Syntax:
GET enter_name_of_index_here/_search
{
  "query": {
    "match": {
      "Specify the field you want to search": {
        "query": "Enter search terms",
        "minimum_should_match": Enter a number here
      }
    }
  }
}
Example:
GET news_headlines/_search
{
  "query": {
    "match": {
      "headline": {
        "query": "Khloe Kardashian Kendall Jenner",
        "minimum_should_match": 3
      }
    }
  }
}
Expected response from Elasticsearch:
With the minimum_should_match parameter, we can fine-tune both precision and recall!
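One way to picture minimum_should_match is as a dial between the two extremes: a value of 1 behaves like "OR" logic, and a value equal to the number of search terms behaves like "and". The Python sketch below counts hits at each setting over a small made-up list of headlines. The `count_hits` helper and the headlines are hypothetical, whitespace splitting stands in for real text analysis, and only integer values of the parameter are modeled.

```python
def count_hits(headlines, query, minimum_should_match):
    """Count headlines containing at least `minimum_should_match` query terms."""
    terms = set(query.lower().split())
    hits = 0
    for headline in headlines:
        matched_terms = terms & set(headline.lower().split())
        if len(matched_terms) >= minimum_should_match:
            hits += 1
    return hits


# Hypothetical headlines containing 4, 3, 2, 1, and 0 of the search terms.
headlines = [
    "Khloe Kardashian and Kendall Jenner attend gala",
    "Kendall Jenner joins Khloe on stage",
    "Khloe Kardashian launches new show",
    "Jenner fans gather downtown",
    "Stock markets rally",
]
query = "Khloe Kardashian Kendall Jenner"

counts = [count_hits(headlines, query, minimum) for minimum in range(1, 5)]
for minimum, hits in zip(range(1, 5), counts):
    print(f"minimum_should_match={minimum}: {hits} hits")
```

Raising the threshold shrinks the result set step by step, which is what makes intermediate values such as 3 a middle ground between recall and precision.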