API Query objects

LoanPro uses Elasticsearch to perform a large portion of its searches. This means that the queries used by LoanPro closely resemble the queries used by Elasticsearch. This article goes over how to use the query objects found throughout the LoanPro API.

By default, queries only return the first ten results. However, this can be overridden by using the $top and $start variables, as detailed in the article API – Paginating Results.

Basic Format

The format of a query object is as follows:

"query": {
    "query": {
        <query data>
    }
}

It may appear redundant to have a “query” object inside of a “query” object, but this lets the LoanPro API tap into the power of Elasticsearch by sending all contents of the outer “query” object directly to Elasticsearch. With this format, the LoanPro API can easily expand to support new features and functionality as they are added to Elasticsearch.

The inner query object is the Elasticsearch query context. Everything inside of this object is formatted according to the Elasticsearch documentation, and you can place multiple queries inside it. This document does not cover everything, as the full details can be found on the Elasticsearch website.

Results from queries are assigned a score, and this score is used to determine the rankings. LoanPro will automatically filter out results that rank too low to be considered useful. However, most queries should be designed to either match or not match; very rarely will you want a “sort of match” when dealing with loans. (For example, if you’re looking for loans with an APR of 5%, you don’t want loans with an APR of 4.8%, even though the difference is very small.)

Please pay attention to the queries as you make them. They must be valid JSON; if the JSON becomes invalid, the behavior of the API becomes undefined (usually it will return results you didn’t want, but that’s not a guarantee).

Note: Due to the complexity of Elasticsearch and the nature of loan data, only a handful of features are officially supported by LoanPro. The official features are the only ones guaranteed to work; all other features are not guaranteed to provide the desired results, or even to be recognized or usable. As a result, it is strongly recommended to use only the officially supported features. However, all features are mentioned for completeness.
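
As a minimal sketch of the format above, the nested “query” objects can be built as ordinary data structures and serialized, which avoids hand-written JSON mistakes. The helper name wrap_query is hypothetical, and the snippet only produces the fragment shown in this section; where that fragment sits in a full request is not covered here.

```python
import json


def wrap_query(inner_query):
    """Wrap an Elasticsearch-style query in the nested "query" objects.

    wrap_query is a hypothetical helper name used only for this sketch.
    """
    return {"query": {"query": inner_query}}


# A single match clause; the field name is taken from the examples in this article.
body = wrap_query({"match": {"loanStatusId": "2"}})

# json.dumps always emits syntactically valid JSON for plain dicts, lists,
# strings, and numbers.
payload = json.dumps(body, indent=4)
print(payload)
```

Serializing from a dictionary means a stray comma or missing brace is caught by Python before anything is sent.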

Query Types

Below is a list of the basic query types:

  • match – Officially supported by LoanPro; determines whether or not the specified field matches the provided value (for text, it sees if all the words in the value match any word in the specified field)
  • multi_match – determines whether or not the specified fields match the provided value
  • common – performs a preliminary query with less-commonly used words and then an adjustment with more commonly used words.
    • Ex. in the query “The brown fox” it will first perform a query for “brown fox” and then, after it receives the results, it will perform an adjustment query (on the first query’s results) for “the”
  • query_string – Officially supported by LoanPro; uses a query parser that exists inside of Elasticsearch to perform a query
  • simple_query_string – also uses a query parser that exists inside of Elasticsearch to perform a query

The fields to match against are the same fields that exist in a loan. Because most of those fields are numerical, the “match” query type will suffice for almost all queries. As a result, the match query type is the only query type officially supported by LoanPro. All other query types are not officially supported and are to be used at your own risk.

Compound Queries

Compound queries are used to combine multiple queries and even provide logical operations. This allows for performing queries such as “Find all loans where the Loan Status is Active but the Loan Sub Status is not Recently Activated”. Each compound query is composed of clauses, which are collections of either more compound queries or basic query types. All clauses are evaluated from the innermost layers to the outermost layers, with results being passed outward. This can be thought of as parentheses in math equations: the innermost nested parentheses are evaluated first and their results are used in the surrounding equation.

Again, due to the structure of the data that will be searchable, only a small part of compound queries are officially supported. The supported compound queries will be marked as officially supported. All of them are mentioned here for completeness.

  • constant_score – returns a score modifier for every match
  • bool – Officially supported by LoanPro; returns whether or not a match was done based off of one or more boolean clauses. The clauses are as follows:
    • must – The clause must be matched
    • filter – Restricts results to match the clause
    • should – The clause should be matched. When other clause types are present, the queries inside are optional, but the more of them that match, the better the ranking. When should is the only clause type, at least one of its queries must match
    • must_not – the clause must not match
  • dis_max – performs two or more queries and then combines the matching documents. Ex. “Get all loans with an APR of 5% or an APR of 10%”
  • function_score – applies a score modifier to matches based off of some function
  • boosting – applies a score modifier to matches
  • indices – allows performing two different queries based on indices
  • and – matches documents by using the AND boolean operator
  • not – matches documents by using the NOT boolean operator
  • or – matches documents by using the OR boolean operator
  • filtered – deprecated. Replaced with the bool operator
  • limit – deprecated

Almost all of the queries run by LoanPro use the “bool” compound query and its child clauses exclusively. By nesting bool queries and match queries, almost every query can be replicated. This can be proven with boolean arithmetic.
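
A minimal sketch of how bool and match queries nest, using the “Active but not Recently Activated” example from this section. The helper names match and bool_query, the field name loanSubStatusId, and the id values "2" and "9" are all assumptions made for illustration; they are not confirmed LoanPro values.

```python
import json


def match(field, value):
    """Build a basic match query for one field."""
    return {"match": {field: value}}


def bool_query(must=None, filters=None, should=None, must_not=None):
    """Build a bool compound query, leaving out clause types that are unused."""
    clauses = {
        "must": must,
        "filter": filters,
        "should": should,
        "must_not": must_not,
    }
    return {"bool": {name: queries for name, queries in clauses.items() if queries}}


# "Loan Status is Active but Loan Sub Status is not Recently Activated".
# loanSubStatusId and the ids "2" and "9" are placeholders, not real values.
query = bool_query(
    must=[match("loanStatusId", "2")],
    must_not=[match("loanSubStatusId", "9")],
)
print(json.dumps(query, indent=4))
```

Because a clause holds a list of queries, any entry in must or must_not can itself be another bool_query(...) result, which is how deeper nesting is expressed.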

Proving the Implementation

Since the output of a bool compound query is either True or False, it can be treated as the output of a boolean operation. Variables (such as A, B, C, D, etc.) will be used to represent the test of a loan against a query (for example, A means “does the loan match query A”). AND, OR, and NOT are the basic gates from which all other gates can be built, and OR can be derived from AND and NOT using DeMorgan’s law: A OR B = NOT(NOT A AND NOT B). Therefore, as long as we can perform a NOT operation and an AND operation, we can perform the series of operations needed for any “full” match (where something either does or doesn’t match, rather than “kind of matches”; this will suffice for almost all searches).

Since the “must” clause ensures that all of its sub-clauses match, it can be simplified into an AND operation. Therefore, by nesting two or more “match” queries inside a “must” clause, we have created our AND operation. The next step is to have a NOT operation. Bool queries provide a “must_not” clause, which applies a NOT operation to everything it contains. This means that any “match” query nested inside becomes “must not match” and is thereby negated, proving we have a NOT operation. Since we have already shown that an OR operation can be built from AND and NOT operations, we have now proven that by supporting just the “bool” and “match” queries we can perform any query that does “full” matching, which is more than sufficient for loans. Please note that this is not the same as “exact” matching, where the text searched for has to match the data 100%; when searching text, Elasticsearch breaks the words apart and searches for the existence of those words in the data.
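
The argument above can be checked mechanically. The sketch below first verifies DeMorgan’s law over every combination of truth values, then shows the nested bool structure the proof implies for an OR built only from must and must_not. The function names are hypothetical, and the resulting query shape is an illustration of the proof rather than something verified against the LoanPro API.

```python
from itertools import product


def or_from_and_not(a, b):
    """A OR B written with only NOT and AND (DeMorgan's law)."""
    return not ((not a) and (not b))


# Compare against Python's built-in `or` for all four truth-value combinations.
holds = all(
    or_from_and_not(a, b) == (a or b)
    for a, b in product([False, True], repeat=2)
)


def or_query(query_a, query_b):
    """Express (A OR B) as NOT(NOT A AND NOT B) using must and must_not."""

    def negate(query):
        # must_not acts as the NOT operation.
        return {"bool": {"must_not": [query]}}

    # must acts as the AND operation over the two negated queries.
    both_fail = {"bool": {"must": [negate(query_a), negate(query_b)]}}
    return negate(both_fail)


print(holds)
print(or_query({"match": {"loanStatusId": 4}}, {"match": {"active": 0}}))
```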

Making a Query

Below is a sample query that will pull all loans that are not both active and have a loanStatusId of 2 (if they only meet one of those criteria they are still pulled).

    "query": {
        "query": {
            "bool": {
                "must_not":
                [
                    {
                        "bool":
                        {
                            "must":
                            [
                                {
                                    "match":
                                    {
                                        "loanStatusId": "2"
                                    }
                                },
                                {
                                    "match": {
                                        "active": "1"
                                    }
                                }
                            ]
                        }
                    }
                ]
            }
        }
    }

We will now go over the basic parts of the query object:

  • bool – this says that any result must match the series of queries contained inside
    • must_not – this says that results must not match the contained items
      • bool – any results must not match what the queries inside contain
        • must – results to exclude must match all of the following parameters
          • match:loanStatusId – results to exclude must have their loan status id be 2
          • match:active – results to exclude must be for active loans

In other words, the query says “Return all results that do not have BOTH a loan status id of 2 AND that are active”.

It can also be written with boolean math as:  ( NOT ( loanStatusId = 2 AND active = 1 ) ).
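
To see which loans the sample query keeps, the sketch below evaluates it locally against four made-up loan records. The matches function is a simplified stand-in written for this illustration: it compares values as strings, handles only match, must, and must_not, and does no scoring, so it is not how Elasticsearch itself evaluates queries.

```python
def matches(query, loan):
    """Return True if `loan` satisfies a simplified bool/match query."""
    if "match" in query:
        field, value = next(iter(query["match"].items()))
        # Compare as strings so that "2" and 2 are treated alike.
        return str(loan.get(field)) == str(value)
    clauses = query["bool"]
    return all(matches(q, loan) for q in clauses.get("must", [])) and not any(
        matches(q, loan) for q in clauses.get("must_not", [])
    )


# The inner query from the sample above: NOT (loanStatusId = 2 AND active = 1).
sample = {"bool": {"must_not": [{"bool": {"must": [
    {"match": {"loanStatusId": "2"}},
    {"match": {"active": "1"}},
]}}]}}

# Made-up records covering every combination of the two criteria.
loans = [
    {"id": 1, "loanStatusId": 2, "active": 1},  # meets both: excluded
    {"id": 2, "loanStatusId": 2, "active": 0},  # meets one: returned
    {"id": 3, "loanStatusId": 4, "active": 1},  # meets one: returned
    {"id": 4, "loanStatusId": 4, "active": 0},  # meets neither: returned
]

returned = [loan["id"] for loan in loans if matches(sample, loan)]
print(returned)  # [2, 3, 4]
```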

Composing a Query

We will now go over how to compose a query. The very first step is to state, in simple, clear, and precise English, what you want. For example, to pull all loans that have a loan status id of 4 or that are not active, we would say “Return all results that have EITHER a loan status id of 4 OR are not active”. We will then translate this to boolean math as follows:

  • Return all results that have – this becomes a set of parentheses
  • EITHER – This indicates that we will be doing an OR in the future
  • a loan status of 4 – this is a constraint we’re going to use
  • OR – we will be placing an OR here
  • are not active – We could split this into a series of query elements, but the best option is to have it be a constraint (active = 0)

We will now write it out:

(loanStatusId = 4 OR active = 0)

Now, we need to figure out what to do about the OR. We could use DeMorgan’s law, or we could take advantage of the “should” part of a bool (remember that as long as a “should” is alone, at least one of its queries needs to be matched, thus making it act like an OR).

We start off with our query body:

"query": {}

Next, we add in a bool for the parentheses:

"query": {"bool": {}}

Since we’re doing an OR, we’ll add in a “should” to the bool:

"query": {"bool": {"should": []}}

Next, we need to add the loanStatusId constraint to the should:

"query": {
   "bool": {
      "should": [
         {
            "match": {
               "loanStatusId": 4
            }
         }
      ]
   }
}

Next, we’ll add the active constraint:

"query": {
   "bool": {
      "should": [
         {
            "match": {
               "loanStatusId": 4
            }
         },
         {
            "match": {
               "active": 0
            }
         }
      ]
   }
}

Congratulations! You have built your first query!
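
The same query can also be assembled in code. In this sketch (the helper name match is hypothetical), each constraint is its own object in the “should” list; two “match” keys cannot reliably share a single JSON object, because duplicate keys overwrite one another in most parsers.

```python
import json


def match(field, value):
    """Build a basic match query for one field."""
    return {"match": {field: value}}


# (loanStatusId = 4 OR active = 0): a lone "should" acts like an OR.
inner = {"bool": {"should": [match("loanStatusId", 4), match("active", 0)]}}

text = json.dumps({"query": inner}, indent=3)
print(text)
```

Adding a third alternative is then a matter of appending one more match(...) entry to the list.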
