text indexes are best suited for unstructured but human-readable content. To index structured content such as email addresses, hostnames, status codes, or tags, use a keyword index instead.
LambdaDB supports four analyzers for tokenization: standard (default), korean, japanese, and english. You can assign multiple analyzers to a single text field to improve search performance.
The keyword type is used for structured content such as IDs, email addresses, hostnames, status codes, zip codes, or tags. Keyword indexes are often used in sorting, aggregations, and term-level queries.
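The difference between text and keyword indexing can be illustrated with a small sketch. This is not LambdaDB's analyzer: `analyze_text` is a hypothetical, simplified stand-in (lowercase, split on non-alphanumeric characters), and `analyze_keyword` assumes that a keyword value is indexed as a single exact term, as is typical for keyword-style indexes.

```python
import re


def analyze_text(value: str) -> list[str]:
    """Simplified stand-in for a text analyzer (an assumption, not LambdaDB's
    implementation): lowercase the input and split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


def analyze_keyword(value: str) -> list[str]:
    """Assumed keyword behavior: the whole value is kept as one exact term."""
    return [value]


email = "Jane.Doe@example.com"
text_terms = analyze_text(email)        # the address is broken into pieces
keyword_terms = analyze_keyword(email)  # the address stays intact

print(text_terms)
print(keyword_terms)
```

Under these assumptions, an exact lookup of the full address only matches the keyword form, which is why structured values such as email addresses fit a keyword index better.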
long indexes are optimized for scoring, sorting, and range queries.
double indexes are optimized for scoring, sorting, and range queries.
boolean indexes accept the JSON values true and false, as well as strings that are interpreted as either true or false.
datetime indexes are optimized for sorting and range queries.
The vector type indexes dense vectors of numeric values. vector indexes are primarily used for k-nearest neighbor (kNN) search and do not support aggregations or sorting. A vector field is supplied as an array of numeric values.
A kNN search finds the k nearest vectors to a query vector, as measured by a similarity metric. LambdaDB supports four similarity metrics: euclidean, dot_product, cosine, and max_inner_product. You can define which similarity metric is used in kNN search.
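To make the metrics concrete, here is a minimal brute-force sketch in plain Python. It shows the textbook definitions of euclidean, dot_product, and cosine and how the choice of metric changes the ranking. It is not LambdaDB's implementation, the function names are hypothetical, and max_inner_product is left out because the text above does not specify how it differs from dot_product.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    # Assumes neither vector is all zeros.
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


def knn(query, vectors, k, metric="cosine"):
    """Brute-force kNN: return the ids of the k nearest vectors."""
    if metric == "euclidean":
        # Smaller distance means nearer.
        ranked = sorted(vectors, key=lambda i: euclidean(query, vectors[i]))
    elif metric == "cosine":
        # Larger similarity means nearer.
        ranked = sorted(vectors, key=lambda i: cosine(query, vectors[i]), reverse=True)
    elif metric == "dot_product":
        ranked = sorted(vectors, key=lambda i: dot_product(query, vectors[i]), reverse=True)
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return ranked[:k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [3.0, 3.0]}
query = [1.0, 0.2]

for metric in ("cosine", "euclidean", "dot_product"):
    print(metric, knn(query, docs, k=2, metric=metric))
```

With this toy data the three metrics return different top-2 results, because dot_product rewards vector length while cosine ignores it and euclidean penalizes distance in absolute terms.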
LambdaDB also supports multi-field vector search, allowing you to perform kNN searches across multiple vector fields simultaneously within a single query. This enables complex semantic search scenarios where you can combine different types of embeddings (e.g., text embeddings, image embeddings) in one search operation.
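The idea of searching several vector fields at once can be sketched as follows. The text does not say how LambdaDB combines per-field scores, so this example assumes a weighted sum of cosine similarities. The field names `text_vec` and `image_vec`, the weights, and the function `multi_field_knn` are all hypothetical.

```python
import math


def cosine(a, b):
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multi_field_knn(query_vectors, docs, k, weights):
    """Score each document by a weighted sum of per-field cosine similarity
    (an assumed combination rule) and return the ids of the top k."""
    scores = {}
    for doc_id, fields in docs.items():
        scores[doc_id] = sum(
            weights[name] * cosine(q, fields[name])
            for name, q in query_vectors.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:k]


docs = {
    "d1": {"text_vec": [1.0, 0.0], "image_vec": [0.0, 1.0]},
    "d2": {"text_vec": [0.0, 1.0], "image_vec": [0.0, 1.0]},
    "d3": {"text_vec": [1.0, 0.0], "image_vec": [1.0, 0.0]},
}
query = {"text_vec": [1.0, 0.0], "image_vec": [0.0, 1.0]}
weights = {"text_vec": 0.7, "image_vec": 0.3}

top = multi_field_knn(query, docs, k=2, weights=weights)
print(top)
```

Here d1 matches the query on both fields and ranks first, while d3 (text match only) outranks d2 (image match only) because the text field carries the larger weight.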
The sparseVector type is designed for storing and indexing sparse vectors, where most elements are zero or missing. Unlike dense vectors, sparse vectors store only the non-zero values along with their corresponding indexes. The sparseVector type supports only the dot_product similarity metric.
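A dot product over sparse vectors only needs the indexes that are present in both vectors. The sketch below assumes a sparse vector is represented as a mapping from index to non-zero value. That representation and the function name `sparse_dot` are illustrative choices, not LambdaDB's storage format.

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {index: value} maps.
    Indexes missing from either vector contribute zero."""
    # Iterate over the smaller vector to minimize lookups.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)


query = {3: 0.5, 17: 1.0, 42: 2.0}
doc = {17: 0.4, 42: 1.5, 99: 3.0}

# Only indexes 17 and 42 overlap: 1.0 * 0.4 + 2.0 * 1.5
score = sparse_dot(query, doc)
print(score)
```

The cost depends on the number of stored entries rather than on the nominal dimensionality, which is the main reason to keep such vectors in sparse form.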
objectIndexConfigs should be specified in order to index the fields inside the object.
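The following sketch shows how a nested configuration of this kind might look as a Python dictionary. Only the name objectIndexConfigs and the type, analyzer, and metric names come from the text above. The keys `type`, `analyzers`, `dimensions`, and `similarity`, the `object` type name, the field names, and the dotted-path helper are assumptions for illustration, so check them against the actual API before use.

```python
# Hypothetical index configuration. Key names other than
# "objectIndexConfigs" are assumptions, not confirmed API.
index_configs = {
    "title": {"type": "text", "analyzers": ["standard", "english"]},
    "status": {"type": "keyword"},
    "embedding": {"type": "vector", "dimensions": 3, "similarity": "cosine"},
    "author": {
        "type": "object",
        # Fields inside the object are indexed only if listed here.
        "objectIndexConfigs": {
            "name": {"type": "text"},
            "email": {"type": "keyword"},
        },
    },
}


def indexed_paths(configs, prefix=""):
    """List the dotted paths of all leaf fields covered by the configuration."""
    paths = []
    for name, config in configs.items():
        path = f"{prefix}{name}"
        nested = config.get("objectIndexConfigs")
        if nested is not None:
            paths.extend(indexed_paths(nested, prefix=f"{path}."))
        else:
            paths.append(path)
    return paths


print(indexed_paths(index_configs))
```

In this sketch, a field inside `author` that is not listed under objectIndexConfigs would simply not appear among the indexed paths.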

| Analyzer | Description |
|---|---|
| standard | Default general-purpose analyzer |
| english | English language analyzer |
| korean | Korean language analyzer |
| japanese | Japanese language analyzer |

| Metric | Description | Use Case |
|---|---|---|
| cosine | Cosine similarity | Most common for text embeddings |
| euclidean | Euclidean distance | Geometric distance calculations |
| dot_product | Dot product similarity | Fast similarity computation |
| max_inner_product | Maximum inner product | Specialized similarity metric |