Migrate from Elasticsearch - LambdaDB Documentation

Use the LambdaDB Migration CLI to migrate an Elasticsearch index into LambdaDB. The CLI inventories the Elasticsearch mapping, generates an editable LambdaDB mapping, creates the target LambdaDB collection when needed, reads documents with Elasticsearch point-in-time pagination, writes LambdaDB documents, saves local checkpoints, and validates migrated records before cutover.

Elasticsearch support is available in LambdaDB Migration CLI v0.1.5 and later. If running lambdadb-migration elasticsearch --help in your installed CLI does not print help for the Elasticsearch command, install a newer release before following this guide.

This guide assumes you are migrating a concrete Elasticsearch index whose source documents can be returned from _source. If you use aliases, data streams, or wildcard patterns that resolve to multiple backing indexes, run one migration per concrete index or prepare a custom consolidation plan.

Elasticsearch stores records as JSON documents inside indexes. Search behavior is controlled by mappings, analyzers, dense vector fields, query DSL, ingest pipelines, aliases, and cluster-level settings. LambdaDB stores each migrated record as a document and indexes only the fields declared in the collection’s indexConfigs.

What the CLI supports

Elasticsearch source	LambdaDB target	Migration behavior
Index	Collection	Migrate one Elasticsearch index into one LambdaDB collection.
`_id`	Document `id`	Copy Elasticsearch `_id` as a LambdaDB string document ID.
`_source` fields	Document fields	Copy fields returned in `_source`. Fields removed from `_source`, or indexes with `_source` disabled, cannot be reconstructed by the default migration path. Nested objects are flattened into dot-path payload fields, then normalized for LambdaDB field names through the generated mapping.
`text` field	`text` index	Generate a LambdaDB text index for searchable full-text fields. Review analyzer choices before cutover because Elasticsearch analyzers are not copied automatically.
`keyword`, `constant_keyword`, `wildcard`	`keyword` index	Generate LambdaDB keyword indexes for exact filters, IDs, categories, tags, and sort keys.
Numeric fields	`long` or `double` index	Map integer-like fields to `long` and floating-point fields to `double`.
`date`, `date_nanos`	`datetime` index	Generate a LambdaDB datetime index. Review source values before writing because the CLI does not rewrite date formats.
`boolean` field	`boolean` index	Copy JSON booleans directly.
`dense_vector` field	`vector` index	Generate an unmanaged LambdaDB vector field with mapped dimensions and similarity. Dense vectors are fetched explicitly with Elasticsearch search `fields`.
`nested` field	`object` index	Store nested values as JSON payload fields and warn that nested query semantics require application-query review.
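
The type rows in the table above can be read as a simple lookup. The following is a minimal sketch of that lookup, not the CLI's implementation. The function name lambdadb_index_type is hypothetical, and the exact membership of the integer and floating-point groups is an assumption based on Elasticsearch's standard numeric field types.

```python
from typing import Optional

# Illustrative lookup only; this is not the CLI's implementation.
# Assumption: which Elasticsearch numeric types count as "integer-like"
# or "floating-point" follows Elasticsearch's standard numeric field types.
INTEGER_TYPES = {"byte", "short", "integer", "long"}
FLOAT_TYPES = {"half_float", "float", "double", "scaled_float"}

# Non-numeric rows copied from the support table.
DIRECT_TYPES = {
    "text": "text",
    "keyword": "keyword",
    "constant_keyword": "keyword",
    "wildcard": "keyword",
    "date": "datetime",
    "date_nanos": "datetime",
    "boolean": "boolean",
    "dense_vector": "vector",
    "nested": "object",
}


def lambdadb_index_type(es_type: str) -> Optional[str]:
    """Return the LambdaDB index type for an Elasticsearch field type.

    Returns None for types the table does not cover, so the caller can
    flag them for manual review.
    """
    if es_type in INTEGER_TYPES:
        return "long"
    if es_type in FLOAT_TYPES:
        return "double"
    return DIRECT_TYPES.get(es_type)
```

A None result is a prompt to review the field by hand, in the same spirit as the unsupported-type warnings in the inventory output.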

What still needs review

Review these features before cutover because the CLI migrates data and generated LambdaDB index configs, not Elasticsearch runtime behavior:

Elasticsearch Query DSL, aggregations, scripts, runtime fields, scoring scripts, and rescoring logic
custom analyzers, token filters, synonyms, normalizers, and language-specific index settings
nested query semantics, parent-child relationships, join fields, and field collapsing
index templates, aliases, data streams, ILM policies, and ingest pipelines
semantic_text, ELSER, model inference pipelines, and other Elasticsearch-managed semantic features
_source exclusions, disabled _source, synthetic _source differences, and stored-field-only designs
application code that expects Elasticsearch response shapes, shard metadata, highlights, or aggregations

For search applications, first decide which production queries must move to LambdaDB queryString, knn, sparseVector, bool, or hybrid queries. Then review the generated LambdaDB mapping around those queries instead of copying every Elasticsearch mapping field mechanically.

Step 1: Set credentials

LambdaDB Cloud uses region-specific API base URLs. Use the base URL, project name, and project API key shown for your project in the LambdaDB Cloud console. Do not assume a global default URL or a fixed project name.

Set your Elasticsearch API key and LambdaDB connection values:

export ELASTIC_API_KEY="YOUR_ELASTICSEARCH_API_KEY"
export LAMBDADB_BASE_URL="YOUR_REGION_BASE_URL"
export LAMBDADB_PROJECT_NAME="YOUR_PROJECT_NAME"
export LAMBDADB_PROJECT_API_KEY="YOUR_PROJECT_API_KEY"

You can also pass Elasticsearch basic auth credentials with --elasticsearch.username and --elasticsearch.password.
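
If you script the migration, it can help to fail early when a variable is unset. The following is a minimal sketch of such a check. The helper name missing_variables and the use_basic_auth switch are hypothetical and are not part of the CLI.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the export statements in this step.
LAMBDADB_VARS = (
    "LAMBDADB_BASE_URL",
    "LAMBDADB_PROJECT_NAME",
    "LAMBDADB_PROJECT_API_KEY",
)


def missing_variables(
    env: Optional[Mapping[str, str]] = None,
    use_basic_auth: bool = False,
) -> List[str]:
    """Return the names of required variables that are unset or empty.

    When use_basic_auth is True, ELASTIC_API_KEY is not required because
    credentials are passed with the username and password flags instead.
    """
    env = os.environ if env is None else env
    required = list(LAMBDADB_VARS)
    if not use_basic_auth:
        required.append("ELASTIC_API_KEY")
    return [name for name in required if not env.get(name)]
```

Calling missing_variables() with no arguments checks the current process environment; an empty list means every required value is present.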

Step 2: Generate inventory and mapping

Run the inventory command against the Elasticsearch endpoint and index:

lambdadb-migration inventory elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --output elasticsearch-inventory.yaml

The output includes the source inventory and an editable LambdaDB mapping:

inventory:
  sourceKind: elasticsearch
  collectionName: articles
  recordCount: 250000
  vectors:
    embedding:
      name: embedding
      dimensions: 768
      similarity: cosine
  payloadIndexes:
    title:
      name: title
      type: text
    metadata.source:
      name: metadata.source
      type: keyword
mapping:
  target:
    collection: articles
    createCollection: true
  vectors:
    embedding:
      targetField: embedding
      dimensions: 768
      similarity: cosine
  payload:
    mode: flatten
    rename:
      metadata.source: metadata_source
    indexConfigs:
      title:
        type: text
      metadata_source:
        type: keyword
  ids:
    targetField: id

Review warnings in the inventory output. Common warnings include unsupported mapping types, Elasticsearch multi-fields that are not copied as separate source fields, nested fields that need query rewrite review, and PIT checkpoint expiry.

Use a concrete index name for inventory and migration. If an alias or wildcard resolves to multiple indexes, the CLI can only generate one LambdaDB mapping from the mapping response, while the source count and PIT read may cover more data than that single generated mapping represents.
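
One way to confirm that a name is a concrete index is to inspect the response of the Elasticsearch get index API (GET /<name>), whose top-level keys are the concrete index names the request resolved to. The following is a minimal sketch that works on the already-parsed JSON response. The function name is hypothetical, and fetching the response is left to your own HTTP client.

```python
from typing import Any, Dict


def is_single_concrete_index(requested: str, get_index_response: Dict[str, Any]) -> bool:
    """Check that `requested` resolved to exactly one index with the same name.

    `get_index_response` is the parsed JSON body of GET /<requested>.
    An alias, data stream, or wildcard resolves to keys that differ from
    the requested name, or to more than one key.
    """
    return sorted(get_index_response) == [requested]
```

If the check returns False, the response keys list the concrete indexes to migrate one at a time.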

Step 3: Review the mapping

Review the generated mapping before migration:

confirm the target LambdaDB collection name
remove payload index configs for fields you only need to store
check generated renames for dotted fields such as metadata.source
confirm dense_vector dimensions and similarity
add LambdaDB text analyzers such as english, korean, or japanese when your Elasticsearch workload depends on language-specific analysis
confirm date and date_nanos values in _source are compatible with LambdaDB datetime fields, or edit the mapping before creating the collection
add explicit field decisions for Elasticsearch multi-fields such as title.keyword if your application depends on exact-match behavior
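
To see what a generated rename such as metadata.source to metadata_source means for a document, the following minimal sketch flattens a nested _source into dot paths and then applies a rename table. It illustrates the idea only and is not the CLI's implementation. Leaving lists and empty objects untouched is an assumption made for this example.

```python
from typing import Any, Dict


def flatten(source: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dot-path keys.

    Assumption for this sketch: lists and empty objects are kept as values.
    """
    flat: Dict[str, Any] = {}
    for key, value in source.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def apply_renames(flat: Dict[str, Any], rename: Dict[str, str]) -> Dict[str, Any]:
    """Rename dot-path keys using a table like mapping.payload.rename."""
    return {rename.get(path, path): value for path, value in flat.items()}


doc = {"title": "Hello", "metadata": {"source": "rss", "lang": "en"}}
flat = flatten(doc)
renamed = apply_renames(flat, {"metadata.source": "metadata_source"})
```

In this example metadata.lang keeps its dotted name because the rename table has no entry for it, which is the kind of gap worth catching during mapping review.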

The CLI maps Elasticsearch l2_norm vector similarity to LambdaDB euclidean. Elasticsearch cosine, dot_product, and max_inner_product map directly.
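
The similarity rule above can be written as a lookup when you review a mapping by script. This is a minimal sketch. It assumes that "map directly" means the LambdaDB similarity name is identical to the Elasticsearch name, which you should confirm against the generated mapping. The function name is hypothetical.

```python
# Assumption: similarities that "map directly" keep the same name in LambdaDB.
# Only the l2_norm to euclidean rename is stated explicitly in this guide.
SIMILARITY_MAP = {
    "l2_norm": "euclidean",
    "cosine": "cosine",
    "dot_product": "dot_product",
    "max_inner_product": "max_inner_product",
}


def lambdadb_similarity(es_similarity: str) -> str:
    """Translate an Elasticsearch dense_vector similarity name."""
    try:
        return SIMILARITY_MAP[es_similarity]
    except KeyError:
        raise ValueError(f"unreviewed similarity: {es_similarity!r}") from None
```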

If you want to fetch vector fields explicitly instead of relying on mapping discovery, pass a comma-separated list:

--elasticsearch.vector-fields embedding,title_embedding

Step 4: Run a dry run

Run a dry run to validate the source inventory and mapping without writing documents:

lambdadb-migration elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --lambdadb.base-url "$LAMBDADB_BASE_URL" \
  --lambdadb.project-name "$LAMBDADB_PROJECT_NAME" \
  --lambdadb.api-key "$LAMBDADB_PROJECT_API_KEY" \
  --lambdadb.collection articles \
  --mapping-file elasticsearch-inventory.yaml \
  --migration.dry-run
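
If you drive the CLI from a script, the dry run and the real run differ only in their trailing flags, so the shared arguments can be built once. The following is a minimal sketch that uses only the flags shown in this guide. The function name and its parameters are hypothetical.

```python
import os
from typing import List, Mapping, Optional


def build_migration_command(
    es_url: str,
    index: str,
    collection: str,
    mapping_file: str,
    dry_run: bool = True,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Build the argument list for lambdadb-migration elasticsearch.

    LambdaDB connection values are read from the variables set in Step 1.
    A KeyError is raised if one of them is missing.
    """
    env = os.environ if env is None else env
    command = [
        "lambdadb-migration", "elasticsearch",
        "--elasticsearch.url", es_url,
        "--elasticsearch.index", index,
        "--lambdadb.base-url", env["LAMBDADB_BASE_URL"],
        "--lambdadb.project-name", env["LAMBDADB_PROJECT_NAME"],
        "--lambdadb.api-key", env["LAMBDADB_PROJECT_API_KEY"],
        "--lambdadb.collection", collection,
        "--mapping-file", mapping_file,
    ]
    if dry_run:
        command.append("--migration.dry-run")
    return command
```

The returned list can be passed to subprocess.run(command, check=True). Passing a list rather than a shell string avoids quoting problems in URLs and keys.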

Step 5: Run the migration

Run the migration with validation enabled:

lambdadb-migration elasticsearch \
  --elasticsearch.url https://your-elasticsearch-endpoint \
  --elasticsearch.index articles \
  --lambdadb.base-url "$LAMBDADB_BASE_URL" \
  --lambdadb.project-name "$LAMBDADB_PROJECT_NAME" \
  --lambdadb.api-key "$LAMBDADB_PROJECT_API_KEY" \
  --lambdadb.collection articles \
  --mapping-file elasticsearch-inventory.yaml \
  --migration.write-mode bulk \
  --migration.validate \
  --migration.validation-report validation-report.json

The Elasticsearch connector reads documents with a point in time (PIT), pages with search_after, and sorts by _shard_doc. It stores the latest PIT ID and search_after values in the local checkpoint. For large indexes or slow write targets, increase the PIT keep-alive window:

--elasticsearch.pit-keep-alive 15m

Elasticsearch PIT IDs can expire. If a resumed migration fails because the saved PIT is no longer valid, rerun with --migration.restart.
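
The read pattern described above (PIT, search_after, sort by _shard_doc) can be sketched as follows. This is an illustration of the Elasticsearch pagination technique, not the CLI's code. The search argument is a hypothetical callable that sends the body to the Elasticsearch _search endpoint and returns the parsed JSON response, and the default page size is an arbitrary choice.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def build_search_body(
    pit_id: str,
    keep_alive: str = "15m",
    size: int = 1000,
    search_after: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build one page request for PIT pagination sorted by _shard_doc."""
    body: Dict[str, Any] = {
        "size": size,
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": [{"_shard_doc": "asc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iterate_hits(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    pit_id: str,
    keep_alive: str = "15m",
    size: int = 1000,
    search_after: Optional[List[Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield hits page by page until Elasticsearch returns an empty page.

    Pass a saved search_after value to resume from a checkpoint.
    """
    while True:
        response = search(build_search_body(pit_id, keep_alive, size, search_after))
        hits = response["hits"]["hits"]
        if not hits:
            return
        # Elasticsearch may return a new PIT ID; always use the latest one.
        pit_id = response.get("pit_id", pit_id)
        yield from hits
        # The sort values of the last hit are the cursor for the next page.
        search_after = hits[-1]["sort"]
```

The latest pit_id and search_after pair is the resume cursor, which is why an expired PIT invalidates a saved checkpoint.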

Step 6: Validate results

Validation compares the accepted record count against the Elasticsearch inventory count, fetches sampled migrated documents from LambdaDB with strongly consistent reads, and compares sampled fields. Use a validation report for review:

--migration.validate \
--migration.validation-report validation-report.json

--migration.query-overlap currently supports Qdrant and Pinecone sources, not Elasticsearch. For Elasticsearch migrations, use count validation, sampled document validation, and manual review of representative LambdaDB search queries.
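
For the manual part of the review, the count and sampled-field checks described above can be reproduced on documents you fetch yourself. The following is a minimal sketch. The function name and the report layout are hypothetical and do not describe the format of validation-report.json.

```python
from typing import Any, Dict, Iterable


def compare_samples(
    source_count: int,
    accepted_count: int,
    source_samples: Dict[str, Dict[str, Any]],
    target_samples: Dict[str, Dict[str, Any]],
    fields: Iterable[str],
) -> Dict[str, Any]:
    """Compare counts and sampled fields between source and target.

    Both sample arguments map a document ID to its (flattened) fields.
    """
    fields = list(fields)
    report: Dict[str, Any] = {
        "count_match": source_count == accepted_count,
        "missing_ids": [],
        "field_mismatches": [],
    }
    for doc_id, source_doc in source_samples.items():
        target_doc = target_samples.get(doc_id)
        if target_doc is None:
            report["missing_ids"].append(doc_id)
            continue
        for field in fields:
            if source_doc.get(field) != target_doc.get(field):
                report["field_mismatches"].append((doc_id, field))
    return report
```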

Example LambdaDB lexical query:

{
  "queryString": {
    "query": "serverless database",
    "defaultField": "body"
  }
}

Example LambdaDB hybrid query:

{
  "rrf": [
    {
      "queryString": {
        "query": "serverless database",
        "defaultField": "body"
      }
    },
    {
      "knn": {
        "field": "embedding",
        "queryVector": [0.1, 0.2, 0.3],
        "k": 10
      }
    }
  ]
}
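
When you compare hybrid results against Elasticsearch, it helps to know how rank fusion combines the lexical and vector lists. The following is a minimal sketch of generic reciprocal rank fusion, in which each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 and the tie-breaking rule are assumptions for illustration and may differ from how LambdaDB computes the rrf query.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs with generic reciprocal rank fusion.

    Ranks start at 1. Ties are broken by document ID so the output is stable.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["a", "b", "c"]
vector = ["b", "d", "a"]
fused = reciprocal_rank_fusion([lexical, vector])
```

Documents that appear in both lists (here a and b) rise above documents that rank well in only one, which is the behavior to look for when reviewing hybrid results.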

Step 7: Cut over safely

Run the LambdaDB path in parallel before replacing production Elasticsearch traffic:

Backfill historical documents.
Replay writes that happened during the backfill window.
Dual-write new updates to Elasticsearch and LambdaDB for a short verification period.
Compare representative query results and latency.
Switch read traffic to LambdaDB.
Keep Elasticsearch available until rollback is no longer needed.
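
For the comparison step in the list above, a simple overlap measure over the top results of each system gives a first signal before manual relevance review. The following is a minimal sketch. The function name and the choice of overlap at k as the metric are assumptions, and the result ID lists come from your own query clients.

```python
from typing import Sequence


def overlap_at_k(es_ids: Sequence[str], lambdadb_ids: Sequence[str], k: int = 10) -> float:
    """Return the share of the Elasticsearch top-k IDs also in the LambdaDB top-k.

    Returns 1.0 when both result lists are empty and 0.0 when only the
    Elasticsearch list is empty.
    """
    es_top = set(es_ids[:k])
    lambdadb_top = set(lambdadb_ids[:k])
    if not es_top:
        return 1.0 if not lambdadb_top else 0.0
    return len(es_top & lambdadb_top) / len(es_top)
```

A low overlap is not necessarily a defect, because analyzers and scoring differ between the two systems, but it marks the queries worth inspecting by hand.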

Related docs

Migration CLI

Learn the shared migration workflow, validation behavior, and checkpoint behavior.

Create a collection

Define the LambdaDB index configuration for migrated fields.

Query string search

Rewrite Elasticsearch lexical queries to LambdaDB query string syntax.

Hybrid query

Combine lexical and vector search after migration.
