Use the LambdaDB Migration CLI to migrate an Elasticsearch index into LambdaDB. The CLI inventories the Elasticsearch mapping, generates an editable LambdaDB mapping, creates the target LambdaDB collection when needed, reads documents with Elasticsearch point-in-time pagination, writes LambdaDB documents, saves local checkpoints, and validates migrated records before cutover.
Elasticsearch support is available in LambdaDB Migration CLI v0.1.5 and later. If running lambdadb-migration elasticsearch --help on your installed CLI does not show help for the Elasticsearch command, install a newer release before following this guide.
Elasticsearch stores records as JSON documents inside indexes. Search behavior is controlled by mappings, analyzers, dense vector fields, query DSL, ingest pipelines, aliases, and cluster-level settings. LambdaDB stores each migrated record as a document and indexes only the fields declared in the collection’s indexConfigs.

What the CLI supports

Elasticsearch source | LambdaDB target | Migration behavior
--- | --- | ---
Index | Collection | Migrate one Elasticsearch index into one LambdaDB collection.
_id | Document id | Copy Elasticsearch _id as a LambdaDB string document ID.
_source fields | Document fields | Copy stored fields. Nested objects are flattened into dot-path payload fields, then normalized for LambdaDB field names through the generated mapping.
text field | text index | Generate a LambdaDB text index for searchable full-text fields.
keyword, constant_keyword, wildcard | keyword index | Generate LambdaDB keyword indexes for exact filters, IDs, categories, tags, and sort keys.
Numeric fields | long or double index | Map integer-like fields to long and floating-point fields to double.
date, date_nanos | datetime index | Generate a LambdaDB datetime index. Review source values before writing because the CLI does not rewrite date formats.
boolean field | boolean index | Copy JSON booleans directly.
dense_vector field | vector index | Generate an unmanaged LambdaDB vector field with mapped dimensions and similarity. Dense vectors are fetched explicitly with Elasticsearch search fields.
nested field | object index | Store nested values as JSON payload fields and warn that nested query semantics require application-query review.
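
To make the _source flattening concrete, here is a minimal Python sketch of how a nested object could become dot-path fields. It is an illustration under stated assumptions, not the CLI's implementation: flatten_source is a hypothetical helper, the sample document is invented, and keeping lists as plain values is an assumption about how arrays are treated.

```python
from typing import Any


def flatten_source(source: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dot-path keys.

    Lists and scalars are kept as values (assumption); an empty nested
    dict contributes no fields.
    """
    flat: dict[str, Any] = {}
    for key, value in source.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so that {"metadata": {"source": ...}} becomes "metadata.source".
            flat.update(flatten_source(value, path))
        else:
            flat[path] = value
    return flat


# Invented sample document, for illustration only.
doc = {"title": "Hello", "metadata": {"source": "blog", "stats": {"views": 42}}}
print(flatten_source(doc))
```

Running the sketch on your own sample documents is a quick way to preview which dot-path names the generated mapping is likely to rename.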

What still needs review

Review these features before cutover because the CLI migrates data and generated LambdaDB index configs, not Elasticsearch runtime behavior:
  • Elasticsearch Query DSL, aggregations, scripts, runtime fields, scoring scripts, and rescoring logic
  • custom analyzers, token filters, synonyms, normalizers, and language-specific index settings
  • nested query semantics, parent-child relationships, join fields, and field collapsing
  • index templates, aliases, data streams, ILM policies, and ingest pipelines
  • semantic_text, ELSER, model inference pipelines, and other Elasticsearch-managed semantic features
  • application code that expects Elasticsearch response shapes, shard metadata, highlights, or aggregations
For search applications, first decide which production queries must move to LambdaDB queryString, knn, sparseVector, bool, or hybrid queries. Then review the generated LambdaDB mapping around those queries instead of copying every Elasticsearch mapping field mechanically.

Step 1: Set credentials

LambdaDB Cloud uses region-specific API base URLs. Use the base URL, project name, and project API key shown for your project in the LambdaDB Cloud console. Do not assume a global default URL or a fixed project name.
Set your Elasticsearch API key and LambdaDB connection values:
export ELASTIC_API_KEY="YOUR_ELASTICSEARCH_API_KEY"
export LAMBDADB_BASE_URL="YOUR_REGION_BASE_URL"
export LAMBDADB_PROJECT_NAME="YOUR_PROJECT_NAME"
export LAMBDADB_PROJECT_API_KEY="YOUR_PROJECT_API_KEY"
You can also pass Elasticsearch basic auth credentials with --elasticsearch.username and --elasticsearch.password.
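
Before running the CLI, it can help to confirm that the variables above are actually set in the current shell. The following Python sketch does only that; missing_vars is a hypothetical helper, and treating an empty value as missing is an assumption. If you use basic auth instead of an API key, ELASTIC_API_KEY may legitimately be unset.

```python
import os

# Variable names taken from the export commands in this step.
REQUIRED_VARS = (
    "ELASTIC_API_KEY",
    "LAMBDADB_BASE_URL",
    "LAMBDADB_PROJECT_NAME",
    "LAMBDADB_PROJECT_API_KEY",
)


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All migration credentials are set.")
```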

Step 2: Generate inventory and mapping

Run the inventory command against the Elasticsearch endpoint and index:
lambdadb-migration inventory elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --output elasticsearch-inventory.yaml
The output includes the source inventory and an editable LambdaDB mapping:
inventory:
  sourceKind: elasticsearch
  collectionName: articles
  recordCount: 250000
  vectors:
    embedding:
      name: embedding
      dimensions: 768
      similarity: cosine
  payloadIndexes:
    title:
      name: title
      type: text
    metadata.source:
      name: metadata.source
      type: keyword
mapping:
  target:
    collection: articles
    createCollection: true
  vectors:
    embedding:
      targetField: embedding
      dimensions: 768
      similarity: cosine
  payload:
    mode: flatten
    rename:
      metadata.source: metadata_source
    indexConfigs:
      title:
        type: text
      metadata_source:
        type: keyword
  ids:
    targetField: id
Review warnings in the inventory output. Common warnings include unsupported mapping types, Elasticsearch multi-fields that are not copied as separate source fields, nested fields that need query rewrite review, and PIT checkpoint expiry.
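
The relationship between payload.rename and payload.indexConfigs in the sample output can be pictured with a short Python sketch. It assumes renames are applied to flattened field names before indexing, which is an inference from the sample rather than documented CLI behavior. apply_renames and split_indexed are hypothetical helpers, and the body field is invented to show a stored-only field.

```python
# Values copied from the sample mapping in this step.
RENAME = {"metadata.source": "metadata_source"}
INDEX_CONFIGS = {"title": {"type": "text"}, "metadata_source": {"type": "keyword"}}


def apply_renames(flat_doc, rename):
    """Return a copy of flat_doc with keys replaced according to the rename map."""
    return {rename.get(key, key): value for key, value in flat_doc.items()}


def split_indexed(doc, index_configs):
    """Separate fields that have an index config from fields that are only stored."""
    indexed = {k: v for k, v in doc.items() if k in index_configs}
    stored_only = {k: v for k, v in doc.items() if k not in index_configs}
    return indexed, stored_only


# "body" is an invented field with no index config.
flat = {"title": "Hello", "metadata.source": "blog", "body": "Long text"}
renamed = apply_renames(flat, RENAME)
indexed, stored_only = split_indexed(renamed, INDEX_CONFIGS)
print(sorted(indexed))      # fields that have an index config
print(sorted(stored_only))  # fields stored without an index
```

A field that lands in the stored-only group is still copied; it just cannot be searched or filtered until an index config is added for it.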

Step 3: Review the mapping

Review the generated mapping before migration:
  • confirm the target LambdaDB collection name
  • remove payload index configs for fields you only need to store
  • check generated renames for dotted fields such as metadata.source
  • confirm dense_vector dimensions and similarity
  • confirm date and date_nanos values in _source are compatible with LambdaDB datetime fields, or edit the mapping before creating the collection
  • add explicit field decisions for Elasticsearch multi-fields such as title.keyword if your application depends on exact-match behavior
The CLI maps Elasticsearch l2_norm vector similarity to LambdaDB euclidean. Elasticsearch cosine, dot_product, and max_inner_product map directly.
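
The similarity mapping described above fits in a small lookup table. This Python sketch assumes that "map directly" means the similarity name carries over unchanged; map_similarity is a hypothetical helper, not a CLI function.

```python
# Elasticsearch dense_vector similarity -> LambdaDB similarity.
# Only l2_norm changes name; the other three are assumed to carry over as-is.
SIMILARITY_MAP = {
    "l2_norm": "euclidean",
    "cosine": "cosine",
    "dot_product": "dot_product",
    "max_inner_product": "max_inner_product",
}


def map_similarity(es_similarity):
    """Translate an Elasticsearch similarity name, failing loudly on unknown values."""
    try:
        return SIMILARITY_MAP[es_similarity]
    except KeyError:
        raise ValueError(
            f"Unsupported Elasticsearch similarity: {es_similarity!r}"
        ) from None


print(map_similarity("l2_norm"))
```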
If you want to fetch vector fields explicitly instead of relying on mapping discovery, pass a comma-separated list:
--elasticsearch.vector-fields embedding,title_embedding

Step 4: Run a dry run

Run a dry run to validate the source inventory and mapping without writing documents:
lambdadb-migration elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --lambdadb.base-url "$LAMBDADB_BASE_URL" \
  --lambdadb.project-name "$LAMBDADB_PROJECT_NAME" \
  --lambdadb.api-key "$LAMBDADB_PROJECT_API_KEY" \
  --lambdadb.collection articles \
  --mapping-file elasticsearch-inventory.yaml \
  --migration.dry-run

Step 5: Run the migration

Run the migration with validation enabled:
lambdadb-migration elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --lambdadb.base-url "$LAMBDADB_BASE_URL" \
  --lambdadb.project-name "$LAMBDADB_PROJECT_NAME" \
  --lambdadb.api-key "$LAMBDADB_PROJECT_API_KEY" \
  --lambdadb.collection articles \
  --mapping-file elasticsearch-inventory.yaml \
  --migration.write-mode bulk \
  --migration.validate \
  --migration.validation-report validation-report.json
The Elasticsearch connector reads documents with a point in time (PIT), pages with search_after, and sorts by _shard_doc. It stores the latest PIT ID and search_after values in the local checkpoint.
Elasticsearch PIT IDs can expire. If a resumed migration fails because the saved PIT is no longer valid, rerun with --migration.restart.
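
The PIT and search_after paging pattern can be sketched in Python without a network call. The request body follows the Elasticsearch search API (pit, sort on _shard_doc, search_after). The helper names, the page size, the keep_alive value, and the checkpoint layout are assumptions for illustration and do not describe the CLI's checkpoint file.

```python
import json


def build_search_body(pit_id, search_after=None, page_size=1000, keep_alive="5m"):
    """Build an Elasticsearch _search body for one PIT page sorted by _shard_doc."""
    body = {
        "size": page_size,
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": [{"_shard_doc": "asc"}],
    }
    if search_after is not None:
        # Resume after the sort values of the last hit on the previous page.
        body["search_after"] = search_after
    return body


def next_checkpoint(response, previous):
    """Derive the next checkpoint from a search response.

    The response may carry a refreshed PIT ID, so prefer it over the old one.
    Returns None when the page is empty, meaning there is nothing left to read.
    """
    hits = response["hits"]["hits"]
    if not hits:
        return None
    return {
        "pit_id": response.get("pit_id", previous["pit_id"]),
        "search_after": hits[-1]["sort"],
    }


# Hand-written response fragment standing in for a real Elasticsearch reply.
response = {"pit_id": "pit-2", "hits": {"hits": [{"_id": "a", "sort": [0]}, {"_id": "b", "sort": [1]}]}}
checkpoint = next_checkpoint(response, {"pit_id": "pit-1", "search_after": None})
print(json.dumps(checkpoint))
print(build_search_body(checkpoint["pit_id"], checkpoint["search_after"]))
```

Because the checkpoint holds a PIT ID, it is only useful while that PIT is still alive, which is why an expired PIT forces a restart rather than a resume.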

Step 6: Validate results

Validation compares the accepted record count against the Elasticsearch inventory count, fetches sampled migrated documents from LambdaDB with strongly consistent reads, and compares sampled fields. To keep the results for review, write them to a validation report:
--migration.validate \
--migration.validation-report validation-report.json
--migration.query-overlap currently supports Qdrant and Pinecone sources, not Elasticsearch. For Elasticsearch migrations, use count validation, sampled document validation, and manual review of representative LambdaDB search queries.
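
For a manual spot check alongside the CLI's validation, count and sampled-field comparison can be sketched in Python. This is not the CLI's validation logic: count_matches and compare_sample are hypothetical helpers that operate on documents you have already fetched from both systems, keyed by document ID.

```python
def count_matches(source_count, accepted_count):
    """True when every source record was accepted by the target."""
    return source_count == accepted_count


def compare_sample(source_docs, target_docs, fields):
    """Return (doc_id, field, source_value, target_value) tuples for each mismatch.

    Both inputs map document ID -> flat field dict. A document missing from
    the target is reported once with field set to None.
    """
    mismatches = []
    for doc_id, source in source_docs.items():
        target = target_docs.get(doc_id)
        if target is None:
            mismatches.append((doc_id, None, None, None))
            continue
        for field in fields:
            if source.get(field) != target.get(field):
                mismatches.append((doc_id, field, source.get(field), target.get(field)))
    return mismatches


# Invented sample data for illustration.
source_docs = {"1": {"title": "A", "year": 2020}, "2": {"title": "B", "year": 2021}}
target_docs = {"1": {"title": "A", "year": 2020}, "2": {"title": "B", "year": 2022}}
print(count_matches(250000, 250000))
print(compare_sample(source_docs, target_docs, ["title", "year"]))
```

Exact equality is a deliberate simplification; floating-point vectors or reformatted dates may need a tolerance or a normalization step before comparison.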
Example LambdaDB lexical query:
{
  "queryString": {
    "query": "serverless database",
    "defaultField": "body"
  }
}
Example LambdaDB hybrid query:
{
  "rrf": [
    {
      "queryString": {
        "query": "serverless database",
        "defaultField": "body"
      }
    },
    {
      "knn": {
        "field": "embedding",
        "queryVector": [0.1, 0.2, 0.3],
        "k": 10
      }
    }
  ]
}
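
When building the hybrid query from application code, a small helper can assemble the same structure and catch a wrong vector length early. The sketch below mirrors the JSON shape shown above. build_hybrid_query and its expected_dims check are hypothetical, and the three-element queryVector in the example is presumably shortened for readability, since the sample mapping declares 768 dimensions.

```python
def build_hybrid_query(text, vector, *, text_field="body", vector_field="embedding",
                       k=10, expected_dims=None):
    """Assemble a hybrid query body in the shape shown in this guide.

    expected_dims is an optional client-side guard (assumption): pass the
    dimensions from your mapping to reject vectors of the wrong length.
    """
    if expected_dims is not None and len(vector) != expected_dims:
        raise ValueError(
            f"query vector has {len(vector)} dimensions, expected {expected_dims}"
        )
    return {
        "rrf": [
            {"queryString": {"query": text, "defaultField": text_field}},
            {"knn": {"field": vector_field, "queryVector": list(vector), "k": k}},
        ]
    }


query = build_hybrid_query("serverless database", [0.1, 0.2, 0.3])
print(query)
```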

Step 7: Cut over safely

Run the LambdaDB path in parallel before replacing production Elasticsearch traffic:
  1. Backfill historical documents.
  2. Replay writes that happened during the backfill window.
  3. Dual-write new updates to Elasticsearch and LambdaDB for a short verification period.
  4. Compare representative query results and latency.
  5. Switch read traffic to LambdaDB.
  6. Keep Elasticsearch available until rollback is no longer needed.
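
For the comparison in step 4, one simple measure is how many of the top results from Elasticsearch also appear in the top results from LambdaDB for the same query. Because Elasticsearch _id values are copied as LambdaDB document IDs, the two result lists can be compared by ID. The Python sketch below is one possible metric, not a CLI feature; overlap_at_k is a hypothetical helper and the ID lists are invented.

```python
def overlap_at_k(es_ids, lambdadb_ids, k=10):
    """Fraction of the Elasticsearch top-k IDs that also appear in the LambdaDB top-k.

    Ranking order inside the top k is ignored. Two empty result lists count
    as full agreement.
    """
    es_top = list(es_ids)[:k]
    ldb_top = set(list(lambdadb_ids)[:k])
    if not es_top:
        return 1.0 if not ldb_top else 0.0
    shared = sum(1 for doc_id in es_top if doc_id in ldb_top)
    return shared / len(es_top)


# Invented result lists for one representative query.
es_results = ["a", "b", "c", "d"]
lambdadb_results = ["b", "a", "x", "d"]
print(overlap_at_k(es_results, lambdadb_results, k=4))
```

A low overlap does not by itself mean the migration is wrong; analyzers and scoring differ between the two systems, so treat the number as a prompt for manual review of the affected queries.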

Elasticsearch references

  • Migration CLI: Learn the shared migration workflow, validation behavior, and checkpoint behavior.
  • Create a collection: Define the LambdaDB index configuration for migrated fields.
  • Query string search: Rewrite Elasticsearch lexical queries to LambdaDB query string syntax.
  • Hybrid query: Combine lexical and vector search after migration.