Query data

A search query, or query, is a request for information about data in LambdaDB collections.

LambdaDB supports several search methods:

Search for exact values: search for exact values or ranges of numbers, dates, IPs, or strings.
Full-text search: use full text queries to query unstructured textual data and find documents that best match query terms.
Vector search: store vectors in LambdaDB and use approximate nearest neighbor (ANN) search to find similar vectors, supporting use cases like semantic search.

Common parameters for the request body of a query are as follows:

Parameter	Description	Type	Required	Default
size	The number of results to return for each query	integer	✓
query	Query object (described in the sections below)	object	✓
includeVectors	Indicates whether vector values are included in the response	boolean		false
consistentRead	Determines the read consistency model: If set to true, then the operation uses strongly consistent reads; otherwise, the operation uses eventually consistent reads.	boolean		false

Depending on your data and your query, you may get fewer than size results. This happens when size is larger than the number of possible matching documents for your query.

LambdaDB is eventually consistent by default, so there can be a slight delay before new or changed documents become visible to queries. If your application requires strongly consistent reads, set consistentRead to true when you query your data, in exchange for potentially higher latency and cost.
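As a minimal sketch, the request body can be assembled from the parameters in the table above. The helper name build_query_body and the title field in the sample query are assumptions made for illustration; only the keys size, query, includeVectors, and consistentRead come from this page.

```python
import json


def build_query_body(query, size, include_vectors=False, consistent_read=False):
    """Assemble a query request body from the common parameters.

    build_query_body is a hypothetical helper, not part of any LambdaDB SDK.
    """
    if isinstance(size, bool) or not isinstance(size, int) or size < 1:
        raise ValueError("size must be a positive integer")
    body = {"size": size, "query": query}
    # Only send the optional flags when they differ from the documented defaults.
    if include_vectors:
        body["includeVectors"] = True
    if consistent_read:
        body["consistentRead"] = True
    return body


# Read-your-writes case: ask for a strongly consistent read.
body = build_query_body(
    {"queryString": {"query": "title:LambdaDB"}},
    size=10,
    consistent_read=True,
)
print(json.dumps(body, indent=2))
```

Leaving the flags out when they are false keeps the request on the default, eventually consistent path.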

Query results are returned in the following format:

Field	Description	Type
took	Milliseconds it took LambdaDB to execute the request	long
maxScore	Highest returned document `score`	float
total	Metadata about the number of matching documents	long
docs	Contains returned documents and metadata	object[]

An example query request to LambdaDB looks like this:

python

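import requests

# baseUrl, projectName, collectionName, and YOUR_API_KEY are placeholders:
# define them with your own values before running this snippet.
url = f"https://{baseUrl}/projects/{projectName}/collections/{collectionName}/query"
headers = {"content-type": "application/json", "x-api-key": YOUR_API_KEY}

query = {
    "size": 10,
    "query": {}  # any query object described in the sections below
}

r = requests.post(url, headers=headers, json=query)
r.raise_for_status()  # raise an exception on HTTP error status codes
result = r.json()     # parsed response body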

The response will look like this:

{
  "took": 76,
  "maxScore": 10.691499,
  "total": 1,
  "docs": [
    {
      "collection": "example_collection",
      "score": 10.691499,
      "doc": {
        "id": "33201222",
        "url": "https://en.wikipedia.org/wiki/LambdaDB",
        "title": "LambdaDB is awesome",
        "text": null
      }
    }
  ]
}

By default, matched documents are ordered from most similar to least similar. Similarity is expressed as a score, which is calculated with the BM25 algorithm for full-text search and with the configured similarity metric for vector search.
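The response can be read with plain dictionary access. The sketch below pulls the document id and score out of each entry in docs; the helper name top_hits is hypothetical, and the id field exists only because the sample document above has one.

```python
def top_hits(response):
    """Return (id, score) pairs from a query response, in the order received."""
    return [
        (hit["doc"].get("id"), hit["score"])
        for hit in response.get("docs", [])
    ]


# The sample response shown above, trimmed to the fields used here.
sample = {
    "took": 76,
    "maxScore": 10.691499,
    "total": 1,
    "docs": [
        {
            "collection": "example_collection",
            "score": 10.691499,
            "doc": {"id": "33201222", "title": "LambdaDB is awesome"},
        }
    ],
}

hits = top_hits(sample)
print(hits)
```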

Query string query

LambdaDB supports a simple query string query that uses the Lucene Query Syntax to search documents. Specifically, it supports term, phrase, and wildcard queries, boolean operators, range operators, and special character escaping.

Parameter	Description	Type	Required
query	Query string	string	✓
defaultField	Default field name to apply query	string

To query a sub-field of an object field, address it with dot notation, as in metadata.author.

An example index configuration:

{
  "metadata": {
    "type": "object",
    "objectMapping": {
      "url": {
        "type": "keyword"
      },
      "author": {
        "type": "keyword"
      }
    }
  }
}

An example query:

{
  "queryString": {
    "query": "metadata.author:Alan"
  }
}
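When the value you search for contains characters that are special in the Lucene Query Syntax (for example the colon and slashes in a URL), it has to be escaped or quoted. The sketch below backslash-escapes the special characters of the Lucene classic query syntax and builds a queryString query for a dotted field path. The helper names are hypothetical, and it is an assumption that LambdaDB accepts escapes in exactly this form.

```python
# Characters with special meaning in the Lucene classic query syntax.
LUCENE_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_lucene(text):
    """Backslash-escape Lucene special characters in a literal value."""
    return "".join(
        "\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch for ch in text
    )


def field_query(path, value, default_field=None):
    """Build a queryString query matching `value` in the field at `path`.

    `path` uses dot notation for object fields, e.g. "metadata.author".
    """
    query = {"query": f"{path}:{escape_lucene(value)}"}
    if default_field is not None:
        query["defaultField"] = default_field
    return {"queryString": query}


print(field_query("metadata.author", "Alan"))
print(escape_lucene("https://example.com/books/5514276"))
```

Values that contain whitespace are not handled here; wrapping them in double quotes as a phrase, as the filter examples on this page do, is the usual alternative.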

Vector query

A k-nearest neighbor (kNN) search finds the k nearest vectors to a query vector, as measured by a similarity metric.

Parameter	Description	Type	Required
field	The name of the vector field to search against	string	✓
queryVector	Query vector	float[]	✓
k	Number of nearest neighbors to return as top `docs`	integer	✓
filter	Query to filter the documents that can match	object

An example vector query:

{
  "knn": {
    "field": "example_vector_field",
    "queryVector": [
        0.030255454,
        -0.058824085,
        -0.065448694,
        -0.03987034,
        0.060786933,
        -0.15469691,
        -0.043918714,
        0.057719983,
        0.054530356,
        0.007080819
    ],
    "k": 5
  }
}

An example vector query with a filter for better performance:

{
  "knn": {
    "filter" : {
      "queryString": {
          "query": "node_type:NODE AND \"https://example.com/books/5514276\"",
          "defaultField": "metadata.url"
      }
    },
    "field": "example_vector_field",
    "queryVector": [
        0.030255454,
        -0.058824085,
        -0.065448694,
        -0.03987034,
        0.060786933,
        -0.15469691,
        -0.043918714,
        0.057719983,
        0.054530356,
        0.007080819
    ],
    "k": 5
  }
}
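The two request shapes above differ only in the optional filter, so a small builder can cover both. This is a sketch: knn_query is a hypothetical helper name, and the short vector is a placeholder rather than a real embedding.

```python
def knn_query(field, query_vector, k, filter_query=None):
    """Build a knn query object, optionally pre-filtered by another query."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not query_vector:
        raise ValueError("query_vector must not be empty")
    knn = {
        "field": field,
        "queryVector": [float(x) for x in query_vector],
        "k": k,
    }
    if filter_query is not None:
        knn["filter"] = filter_query
    return {"knn": knn}


url_filter = {
    "queryString": {
        "query": '"https://example.com/books/5514276"',
        "defaultField": "metadata.url",
    }
}
query = knn_query(
    "example_vector_field",
    [0.030255454, -0.058824085, -0.065448694],  # placeholder vector
    k=5,
    filter_query=url_filter,
)
print(query)
```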

Sparse vector query

The sparse vector query executes a search using sparse vector representations, typically generated by learned sparse retrieval models. You must provide precalculated token-weight pairs as your query vectors, where each pair represents a term and its corresponding relevance score. Currently, LambdaDB does not support built-in natural language processing models for automatic sparse vector generation.

Parameter	Description	Type	Required
field	The name of the vector field to search against	string	✓
queryVector	Query vector	object	✓

An example sparse vector query:

{
    "sparseVector": {
        "field": "example_sparse_field",
        "queryVector": {
            "LambdaDB": 0.5,
            "awesome": 0.3,
            "AI": 0.2
        }
    }
}

If your sparse vectors come in an index-based format (separate index and value arrays), use the index positions as keys in the queryVector object:

{
    "sparseVector": {
        "field": "example_sparse_field",
        "queryVector": {
            "10": 0.5,
            "45": 0.3,
            "234": 0.2
        }
    }
}
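Converting separate index and value arrays into this keyed form is mechanical. The sketch below does the conversion and checks the sparse vector limits listed under Query limits (at most 4,096 non-zero values, values in the range 0-64). The helper name is hypothetical; treating the upper bound as inclusive and dropping zero weights are assumptions of this sketch.

```python
MAX_NONZERO_VALUES = 4096  # from the Query limits table
MAX_VALUE = 64             # upper end of the documented 0-64 range


def sparse_vector_query(field, indices, values):
    """Build a sparseVector query from parallel index and value arrays."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    query_vector = {}
    for index, value in zip(indices, values):
        if value == 0:
            continue  # assumption: zero weights can simply be omitted
        if not 0 < value <= MAX_VALUE:
            raise ValueError(f"value {value} is outside the range 0-{MAX_VALUE}")
        key = str(index)
        if key in query_vector:
            raise ValueError(f"duplicate index {index}")
        query_vector[key] = float(value)
    if len(query_vector) > MAX_NONZERO_VALUES:
        raise ValueError("too many non-zero values")
    return {"sparseVector": {"field": field, "queryVector": query_vector}}


q = sparse_vector_query("example_sparse_field", [10, 45, 234], [0.5, 0.3, 0.2])
print(q)
```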

Boolean query

A boolean query matches documents that satisfy boolean combinations of other queries. It maps to Lucene's BooleanQuery and is built from one or more boolean clauses, each with a typed occurrence. The occurrence types are:

Occur	Description
filter	The clause (query) must appear in matching documents. Unlike `must`, however, the score of the query is ignored. Each query defined under a `filter` acts as a logical “AND”, returning only documents that match all the specified queries.
must	The clause (query) must appear in matching documents and will contribute to the score. Each query defined under a `must` acts as a logical “AND”, returning only documents that match all the specified queries.
must_not	The clause (query) must not appear in the matching documents. Each query defined under a `must_not` acts as a logical “NOT”, returning only documents that do not match any of the specified queries.
should	The clause (query) should appear in the matching document. Each query defined under a `should` acts as a logical “OR”, returning documents that match any of the specified queries.

Below is an example of a boolean query:

{
  "bool": [
    {
      "queryString": {
        "query": "node_type:NODE"
      },
      "occur" : "filter"
    },
    {
      "queryString": {
        "query": "content:LambdaDB"
      },
      "occur" : "should"
    }
  ]
}

This query returns only documents whose node_type is NODE, ranked by how well they match content:LambdaDB.

The occur parameter is optional and defaults to should.
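A boolean query is a list of clauses, each an ordinary query object with an optional occur key. The sketch below builds the example shown in this section; clause and bool_query are hypothetical helper names.

```python
VALID_OCCUR = {"filter", "must", "must_not", "should"}


def clause(query, occur=None):
    """Attach an occurrence type to a query; omit it to use the default."""
    if occur is None:
        return dict(query)  # the server-side default (should) applies
    if occur not in VALID_OCCUR:
        raise ValueError(f"unknown occur value: {occur!r}")
    return {**query, "occur": occur}


def bool_query(*clauses):
    """Combine clauses into a boolean query."""
    return {"bool": list(clauses)}


query = bool_query(
    clause({"queryString": {"query": "node_type:NODE"}}, "filter"),
    clause({"queryString": {"query": "content:LambdaDB"}}, "should"),
)
print(query)
```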

Boost query

A boost value adjusts how much a query contributes to the final score. Values less than one give the query less weight relative to the other queries, while values greater than one give its scores more weight.

The boost value must be greater than zero.

{
  "bool": [
    {
      "queryString": {
        "query": "node_type:NODE"
      },
      "boost": 0.8
    },
    {
      "queryString": {
        "query": "content:LambdaDB"
      },
      "boost": 0.2
    }
  ]
}

The final score of a matched document weights each query's score in proportion to its boost value.

Hybrid query

You can combine any of the queries described above with a vector query to improve relevance.

LambdaDB supports three rescoring methods to combine results from different query types: rrf (Reciprocal Rank Fusion), mm (min-max), and l2 (l2_norm). RRF combines rankings using the reciprocal of each result’s rank position, providing balanced weighting across different search methods. Min-max normalization scales scores to a 0-1 range before combining them, while L2 norm applies Euclidean-norm-based normalization to merge relevance scores from multiple query sources.

Regardless of the rescoring method used, the final combined score is always normalized to a value between 0 and 1.

The boost parameter is only available for the mm (min-max) and l2 (l2_norm) rescoring methods. When you use boost parameters, the sum of all boost values must equal 1, and each individual boost value must be between 0 and 1.
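These boost rules can be checked on the client before a request is sent. The sketch below wraps clauses in a rescoring method and validates the boosts; hybrid_query is a hypothetical helper, the short vector is a placeholder, and treating the 0 and 1 bounds as inclusive is an assumption.

```python
import math

HYBRID_METHODS = {"rrf", "mm", "l2"}


def hybrid_query(method, clauses):
    """Wrap clauses in a rescoring method after checking the boost rules."""
    if method not in HYBRID_METHODS:
        raise ValueError(f"unknown rescoring method: {method!r}")
    boosts = [c["boost"] for c in clauses if "boost" in c]
    if boosts:
        if method == "rrf":
            raise ValueError("boost is only available for mm and l2")
        if any(not 0 <= b <= 1 for b in boosts):
            raise ValueError("each boost must be between 0 and 1")
        # isclose tolerates floating-point rounding when summing boosts.
        if not math.isclose(sum(boosts), 1.0, abs_tol=1e-9):
            raise ValueError("boost values must sum to 1")
    return {method: list(clauses)}


text_clause = {"queryString": {"query": "content:LambdaDB"}, "boost": 0.7}
vector_clause = {
    "knn": {
        "field": "text_embedding",
        "queryVector": [0.1, 0.2, 0.3],  # placeholder vector for illustration
        "k": 5,
    },
    "boost": 0.3,
}
query = hybrid_query("l2", [text_clause, vector_clause])
print(query)
```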

An example hybrid query:

{
  "l2": [
    {
      "queryString": {
          "query": "content:LambdaDB"
      },
      "boost" : 0.7
    },
    {
      "knn": {
        "filter" : {
          "queryString": {
            "query": "\"https://lambdadb.ai\"",
            "defaultField": "metadata.url"
          }
        },
        "field": "text_embedding",
        "queryVector": [ ... {float array} ... ],
        "k": 5
      },
      "boost" : 0.3
    }
  ]
}

The example above works as follows:

Two queries, a queryString and a knn, are combined under the l2 rescoring method to form a hybrid query.

The scores of documents matching the queryString are weighted by a boost of 0.7.

In the knn, the filter is applied first (pre-filtering); the vector query then scores the 5 most similar of the remaining documents, weighted by a boost of 0.3.

The final returned documents may not include the requested number of documents from the knn query. This is because the scores of documents returned solely from the queryString may be higher than those of the top 5 documents returned from the knn.
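The effect can be illustrated with a small client-side simulation. This is only a sketch: it uses textbook min-max normalization followed by a boost-weighted sum, which is not necessarily the formula LambdaDB applies, and the document ids and raw scores are made up.

```python
def min_max(scores):
    """Scale a {doc_id: score} mapping to the 0-1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(result_sets):
    """Combine (boost, scores) pairs into one ranking, best first."""
    combined = {}
    for boost, scores in result_sets:
        for doc, s in min_max(scores).items():
            combined[doc] = combined.get(doc, 0.0) + boost * s
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


text_scores = {"a": 12.0, "b": 9.0, "c": 3.0}  # made-up queryString scores
knn_scores = {"c": 0.9, "d": 0.8}              # made-up knn scores

ranked = fuse([(0.7, text_scores), (0.3, knn_scores)])
print(ranked[:2])
```

With these numbers the two highest-ranked documents, a and b, were both returned only by the queryString, so a request with size 2 would contain no knn-only result.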

Sort search results

LambdaDB lets you sort results on one or more fields. The sort order is defined per field, so each sort can be ascending or descending independently of the others.

Sorting can be done on the following data types:

long
double
datetime

The sort order option can have the following values:

asc: sorts in ascending order
desc: sorts in descending order

Assuming the following index configurations:

{
  "indexConfigs": {
    "post_date": {
      "type": "datetime"
    },
    "user": {
      "type": "keyword"
    },
    "name": {
      "type": "keyword"
    },
    "age": {
      "type": "integer"
    }
  }
}

You can sort search results as follows:

{
  "query" : {
    "queryString" : { "query" : "name:LambdaDB" }
  },
  "sort" : [
    { "post_date" : "desc" },
    { "age" : "desc" }
  ]
}
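A sorted request is a normal query body with an extra sort list. The sketch below builds one and rejects unknown sort orders; sorted_query is a hypothetical helper, and the size value is added because size is a required parameter.

```python
def sorted_query(query, sort_fields, size=10):
    """Build a request body that sorts on (field, order) pairs, in priority order."""
    sort = []
    for field, order in sort_fields:
        if order not in ("asc", "desc"):
            raise ValueError(f"sort order must be 'asc' or 'desc', got {order!r}")
        sort.append({field: order})
    return {"size": size, "query": query, "sort": sort}


body = sorted_query(
    {"queryString": {"query": "name:LambdaDB"}},
    [("post_date", "desc"), ("age", "desc")],
)
print(body)
```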

Query limits

Metric	Limit
Max `size` value	100
Max result size	6MB
Max execution time	30s
Max boolean clauses	4,096
Max non-zero values for sparse vector query	4,096
Value range for sparse vector query	0-64

The query result size is affected by the dimension of the vectors and whether vector values are included in the result.

If a query fails because it exceeds the 6MB result size limit, choose a lower size value, or set includeVectors to false to exclude vector values from the result.
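Some of these limits can be checked before a request is sent. The sketch below covers the maximum size value and the boolean clause count; check_limits is a hypothetical helper, and counting only the top-level clauses of a bool query is an assumption about how the clause limit is measured.

```python
MAX_SIZE = 100           # from the limits table
MAX_BOOL_CLAUSES = 4096  # from the limits table


def check_limits(body):
    """Return a list of human-readable limit violations for a request body."""
    problems = []
    size = body.get("size", 0)
    if size > MAX_SIZE:
        problems.append(f"size {size} exceeds the maximum of {MAX_SIZE}")
    # Assumption: only top-level clauses of a bool query are counted here.
    clauses = body.get("query", {}).get("bool", [])
    if len(clauses) > MAX_BOOL_CLAUSES:
        problems.append(
            f"{len(clauses)} boolean clauses exceed the maximum of {MAX_BOOL_CLAUSES}"
        )
    return problems


body = {"size": 500, "query": {"bool": [{"queryString": {"query": "a:b"}}]}}
print(check_limits(body))
```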
