How to Use the _source Query Parameter in Elasticsearch

By squashlabs, Last Updated: October 25, 2023

By default, an Elasticsearch search returns the complete stored source (the original JSON, exposed as “_source”) of every matching document. You often don't need all of it, especially when documents are large. The “_source” parameter lets you list the fields to include in or exclude from each hit. Returning only the fields you need reduces the size of the response and the network traffic between Elasticsearch and your application, which usually makes queries faster end to end.

Syntax for Querying in Elasticsearch

To understand how the source parameter works, let’s first take a look at the basic syntax for querying in Elasticsearch. Elasticsearch uses a query language called Query DSL (Domain Specific Language) to define search queries. Query DSL is a JSON-based syntax that allows you to express complex search criteria in a concise and readable manner.

Here’s an example of a simple search query in Elasticsearch using Query DSL:

GET /my_index/_search
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

In this example, we are searching for documents in the “my_index” index that have the term “elasticsearch” in the “title” field. The search results will include all the fields of the matching documents.

Specifying the _source Parameter in Elasticsearch

To control which fields come back, add the “_source” parameter to the search request body. It accepts a boolean (false omits the source from the hits entirely), a single field name or wildcard pattern, an array of field names or patterns, or an object with “includes” and “excludes” lists.

Here’s an example of how to use the “_source” parameter to include specific fields in the search results:

GET /my_index/_search
{
  "_source": ["title", "author"],
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

In this example, we are searching for documents in the “my_index” index that have the term “elasticsearch” in the “title” field. Because “title” and “author” are listed in the “_source” parameter, the source of each hit contains only those two fields. Metadata such as “_index”, “_id”, and “_score” is still returned for every hit.
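From application code, the same request body can be assembled as a plain dictionary before it is sent. The following is a minimal sketch in Python using only the standard library. The helper name build_search_body is hypothetical, and the field names are the ones from the example above.

```python
import json

def build_search_body(query, source=None):
    """Return a search request body, adding "_source" only when requested.

    `source` may be a list of field names, or False to leave the
    document source out of the hits entirely.
    """
    body = {"query": query}
    if source is not None:
        body["_source"] = source
    return body

# Only "title" and "author" are returned in the source of each hit.
body = build_search_body(
    {"match": {"title": "elasticsearch"}},
    source=["title", "author"],
)

# Hits still carry metadata such as _id and _score, but no source at all.
ids_only = build_search_body({"match": {"title": "elasticsearch"}}, source=False)

print(json.dumps(body, indent=2))
```

Passing False is handy when the application only needs document IDs or scores.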

Exploring the Query DSL in Elasticsearch

Query DSL in Elasticsearch provides a wide range of options to construct complex search queries. It allows you to combine multiple query clauses, filter documents based on specific criteria, apply scoring functions, and more. Let’s explore some of the commonly used query clauses and features of Query DSL.

Match Query

The match query is one of the simplest and most commonly used query clauses in Elasticsearch. It analyzes the text you supply and finds documents whose field contains any of the resulting terms. To match an exact phrase, use the match_phrase query instead. Here’s an example of how to use the match query:

GET /my_index/_search
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

In this example, we are searching for documents in the “my_index” index that have the term “elasticsearch” in the “title” field.

Bool Query

The bool query is a useful query clause that allows you to combine multiple query clauses using boolean logic. It supports must, must_not, and should clauses, which respectively define mandatory, prohibited, and optional criteria for the search results. Here’s an example of how to use the bool query:

GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "elasticsearch" } },
        { "range": { "year": { "gte": 2010 } } }
      ],
      "must_not": [
        { "match": { "category": "fiction" } }
      ],
      "should": [
        { "term": { "author": "John Doe" } },
        { "term": { "author": "Jane Smith" } }
      ]
    }
  }
}

In this example, we are searching for documents in the “my_index” index that meet the following criteria:
– The “title” field must contain the term “elasticsearch” and the “year” field must be greater than or equal to 2010.
– The “category” field must not contain the term “fiction”.
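– The “author” field should contain either “John Doe” or “Jane Smith”.

Because the query also has must clauses, the should clauses are optional: matching one raises a document's relevance score but is not required. The term query compares against the exact, unanalyzed value, so “author” is assumed here to be a keyword field.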
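Because a bool query is just nested JSON, it can be composed programmatically. The sketch below (Python, standard library only) rebuilds the query above. The helper bool_query is hypothetical and not part of any Elasticsearch client. It also sets minimum_should_match, a bool query option that makes at least that many should clauses mandatory.

```python
def bool_query(must=(), must_not=(), should=(), minimum_should_match=None):
    """Assemble a bool query, leaving out clause lists that are empty."""
    clauses = {"must": list(must), "must_not": list(must_not), "should": list(should)}
    bool_body = {name: items for name, items in clauses.items() if items}
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"bool": bool_body}

query = bool_query(
    must=[
        {"match": {"title": "elasticsearch"}},
        {"range": {"year": {"gte": 2010}}},
    ],
    must_not=[{"match": {"category": "fiction"}}],
    should=[
        {"term": {"author": "John Doe"}},
        {"term": {"author": "Jane Smith"}},
    ],
    # Require at least one of the two author clauses to match.
    minimum_should_match=1,
)
```

Without minimum_should_match, the two author clauses in this query would only influence scoring.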

Examples of Elasticsearch Queries

Let’s now explore some examples of Elasticsearch queries that demonstrate different use cases and scenarios.

Example 1: Searching for Documents with a Specific Field Value

Suppose we have an index called “products” that contains information about various products, including their names, descriptions, and prices. We want to search for products that have a specific price. Here’s how we can do it:

GET /products/_search
{
  "query": {
    "term": {
      "price": 100
    }
  }
}

In this example, we are searching for documents in the “products” index that have a price of 100. The search results will include all the fields of the matching documents.

Example 2: Searching for Documents with a Range of Values

Continuing with the previous example, suppose we want to search for products that have a price within a specific range, such as between 50 and 100. Here’s how we can do it:

GET /products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 50,
        "lte": 100
      }
    }
  }
}

In this example, we are searching for documents in the “products” index that have a price greater than or equal to 50 and less than or equal to 100. The search results will include all the fields of the matching documents.

Querying Elasticsearch Using the Source Parameter

Now that we understand the basics of the source parameter in Elasticsearch, let’s see how we can use it to control the fields returned in the search results.

Example 1: Including Specific Fields in the Search Results

Suppose we have an index called “employees” that contains information about employees, including their names, departments, and salaries. We want to search for employees in the “sales” department and only retrieve their names and salaries. Here’s how we can do it:

GET /employees/_search
{
  "_source": ["name", "salary"],
  "query": {
    "term": {
      "department": "sales"
    }
  }
}

In this example, we are searching for documents in the “employees” index whose “department” field holds the exact value “sales”. Because “name” and “salary” are listed in the “_source” parameter, Elasticsearch returns only those two fields for each hit.

Example 2: Excluding Specific Fields from the Search Results

Continuing with the previous example, suppose we want to search for employees in the “sales” department and exclude their salaries from the search results. Here’s how we can do it:

GET /employees/_search
{
  "_source": {
    "excludes": ["salary"]
  },
  "query": {
    "term": {
      "department": "sales"
    }
  }
}

In this example, we are searching for documents in the “employees” index that have the term “sales” in the “department” field. However, the “salary” field is left out of the search results. With the “excludes” list inside the “_source” parameter, Elasticsearch returns every field except the ones listed. The two lists can also be combined: “includes” selects a set of fields and “excludes” removes fields from that set.
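To make the effect of include and exclude lists concrete, here is a small client-side imitation in Python. It is only an illustration under simplifying assumptions (top-level fields only, shell-style wildcards via fnmatch) and not how Elasticsearch implements source filtering. The function name and the sample document are made up.

```python
from fnmatch import fnmatchcase

def filter_source(doc, includes=None, excludes=None):
    """Keep fields that match an include pattern and no exclude pattern."""
    includes = includes or ["*"]   # no include list means "everything"
    excludes = excludes or []
    return {
        field: value
        for field, value in doc.items()
        if any(fnmatchcase(field, pattern) for pattern in includes)
        and not any(fnmatchcase(field, pattern) for pattern in excludes)
    }

# Hypothetical employee document.
employee = {"name": "Ada", "department": "sales", "salary": 5000}

without_salary = filter_source(employee, excludes=["salary"])
name_and_salary = filter_source(employee, includes=["name", "salary"])
```

A field that matches both lists is dropped, which mirrors the precedence of “excludes” over “includes”.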

Different Options for Querying in Elasticsearch

In addition to the source parameter, Elasticsearch provides various options for querying and retrieving data from your indices. Here are some of the different options available:

Field Queries

Field queries (called term-level queries in the Elasticsearch documentation) find documents based on exact values in a field. Examples include term, terms, range, and exists. They suit structured data such as numbers, dates, and keyword fields, for instance an exact price or a range of years.

Full-Text Queries

Full-text queries, such as match and match_phrase, search the textual content of documents. They analyze the query text before matching, which enables features such as fuzzy matching, stemming, and relevance scoring. Use them to find documents that contain a term or phrase in one or more text fields.

Aggregations

Aggregations in Elasticsearch allow you to perform analysis and computations on your data, such as calculating average values, grouping documents by specific criteria, and more. Aggregations can be used to generate meaningful insights from your data and extract valuable information.
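As a sketch of what an aggregation request looks like, the Python snippet below builds a body that asks for the average of a numeric field. The helper names and the aggregation label average_price are arbitrary, the “price” field is borrowed from the earlier “products” example, and the sample response is hand-written only to show where the result appears.

```python
import json

def average_request(field, name="average_value"):
    """Build a body that asks only for the average of one numeric field."""
    return {
        "size": 0,  # skip the hits; only the aggregation result is needed
        "aggs": {name: {"avg": {"field": field}}},
    }

def read_average(response, name="average_value"):
    """Pull the computed average out of a search response dictionary."""
    return response["aggregations"][name]["value"]

body = average_request("price", name="average_price")
print(json.dumps(body, indent=2))

# Hand-written stand-in for a response, only to show where the value lives.
sample_response = {"aggregations": {"average_price": {"value": 75.0}}}
average = read_average(sample_response, name="average_price")
```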

Sorting and Pagination

Elasticsearch provides options for sorting and paginating the search results. The “sort” parameter orders the results by one or more fields. To page through results, use “size” to limit the number of documents returned and “from” to set how many matching documents to skip.
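The arithmetic behind page-based navigation is simple enough to sketch in Python. The helper page_body is hypothetical; “from” and “size” are the request parameters Elasticsearch uses for offset pagination. Note that, by default, “from” plus “size” cannot exceed 10,000 (the index.max_result_window setting), so deep pagination needs a different approach such as search_after.

```python
def page_body(query, page, page_size=10, sort=None):
    """Build a request body for a 1-based page number using from/size."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    body = {
        "query": query,
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }
    if sort:
        body["sort"] = sort
    return body

# Third page of ten results, newest first.
third_page = page_body(
    {"match": {"title": "elasticsearch"}},
    page=3,
    sort=[{"year": {"order": "desc"}}],
)
```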

Querying in Elasticsearch Without a Specific Language

Elasticsearch provides multiple ways to query your data without relying on a specific programming language or framework. In addition to using the RESTful API directly, you can interact with Elasticsearch using various tools and libraries, such as Kibana, the official Elasticsearch client libraries, and third-party integrations.
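Because the API is plain HTTP with JSON bodies, any HTTP client will do. The sketch below prepares a search request with Python's standard library and does not send it. The URL http://localhost:9200 is an assumption (a local development node without authentication), and make_search_request is a hypothetical helper.

```python
import json
import urllib.request

def make_search_request(base_url, index, body):
    """Prepare (but do not send) an HTTP request for the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = make_search_request(
    "http://localhost:9200",  # assumed local, unsecured development node
    "my_index",
    {"_source": ["title", "author"], "query": {"match": {"title": "elasticsearch"}}},
)

# To actually run the search against a reachable cluster:
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
```

The search endpoint accepts POST as well as GET, which helps with HTTP clients that cannot send a body with GET.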

Kibana

Kibana is a useful data exploration and visualization tool that is tightly integrated with Elasticsearch. With Kibana, you can easily create and execute queries, visualize search results, and build interactive dashboards to monitor and analyze your data. Kibana provides a user-friendly interface that allows you to perform complex queries and explore your data without writing any code.

Elasticsearch Client Libraries

Elasticsearch provides official client libraries for various programming languages, including Java, Python, JavaScript, and more. These client libraries provide a convenient way to interact with Elasticsearch, allowing you to execute search queries, index documents, and perform other operations programmatically. Using client libraries, you can integrate Elasticsearch into your applications and leverage its querying capabilities.

Third-Party Integrations

Elasticsearch has a vibrant ecosystem of third-party integrations and plugins that extend its functionality and make it easier to query and analyze your data. These integrations include frameworks, libraries, and tools that provide additional features, such as advanced analytics, machine learning, and data visualization. By leveraging these integrations, you can enhance your querying experience and gain deeper insights into your data.

Purpose of the Source Parameter in Elasticsearch

The source parameter in Elasticsearch serves two main purposes:

1. Performance Optimization: Listing the fields to include or exclude reduces the amount of data serialized and sent over the network between Elasticsearch and your application. The saving is largest when documents are big or the connection is slow. Elasticsearch still reads the stored source before filtering it, so the gain comes mainly from smaller responses.

2. Data Privacy and Security: In certain scenarios, you may not want to expose all fields of your documents in the search results. The “_source” parameter lets you keep sensitive or confidential fields out of the responses your application requests. It is not an access control, though: any client that can send its own queries can ask for those fields, so enforce real restrictions with a mechanism such as field-level security.
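The size argument in the first point can be illustrated with a rough, client-side calculation in Python. The document below is made up, and the numbers only show the order of magnitude for one large field; they are not a measurement of Elasticsearch itself.

```python
import json

# A made-up document with one large field, for illustration only.
document = {
    "title": "Getting started with Elasticsearch",
    "author": "Jane Smith",
    "body": "lorem ipsum " * 2000,
}

# The fields an application might list in "_source".
wanted = ["title", "author"]
trimmed = {field: document[field] for field in wanted}

full_size = len(json.dumps(document).encode("utf-8"))
trimmed_size = len(json.dumps(trimmed).encode("utf-8"))

print(f"full source:    {full_size} bytes")
print(f"trimmed source: {trimmed_size} bytes")
```

Multiplied across every hit in a response, the difference adds up quickly.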

Constructing Complex Queries in Elasticsearch

Elasticsearch provides a rich set of querying capabilities that allow you to construct complex queries to retrieve the desired data from your indices. By combining different query clauses, filters, aggregations, and sorting options, you can build sophisticated queries that meet your specific requirements.

Here’s an example of a complex query in Elasticsearch that combines multiple query clauses and filters:

GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "elasticsearch"
          }
        },
        {
          "range": {
            "year": {
              "gte": 2010
            }
          }
        }
      ],
      "filter": {
        "term": {
          "category": "technology"
        }
      }
    }
  },
  "sort": [
    {
      "year": {
        "order": "desc"
      }
    }
  ],
  "size": 10
}

In this example, we are searching for documents in the “my_index” index that meet the following criteria:
– The “title” field must contain the term “elasticsearch” and the “year” field must be greater than or equal to 2010.
– The “category” field must be equal to “technology”. As a filter clause, this condition narrows the results without affecting relevance scores.
– The search results will be sorted in descending order based on the “year” field.
– Only the top 10 matching documents will be returned.
