This guide walks you through getting a project API key, installing the SDK, creating your first collection, and running hybrid search queries. You’ll set up a collection with text, keyword, and dense vector fields, then run both full-text and hybrid searches that combine keyword-based search with vector similarity.

🔑 Step 1: Get your API key

You’ll use a project API key from LambdaDB Cloud starting in Step 3.
LambdaDB Cloud is in public preview.
  1. Sign in to LambdaDB Cloud: Open app.lambdadb.ai, sign up if needed, and sign in.
  2. Create a project: Accounts without a payment method are on the Free plan. Choose an AWS region that fits your latency or data-residency needs.
  3. Copy your project API key: The API key is shown only once after the project is created. Store it somewhere safe, then use it with the base URL and project name from the console starting in Step 3.
The Free plan includes monthly read, write, and storage usage at no cost. Add a payment method when you need Standard plan usage beyond the Free plan limits. See Understanding costs.
The Cloud console also supports loading data and running queries in the GUI, alongside the SDK examples in this guide.
Keep your API key out of source control and prefer environment variables instead of hardcoding it in scripts.
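As a minimal sketch of the environment-variable approach, the helper below collects the three client settings and fails fast if any is missing. The variable names (LAMBDADB_PROJECT_API_KEY, LAMBDADB_BASE_URL, LAMBDADB_PROJECT_NAME) and the load_settings function are hypothetical choices for this example, not names defined by LambdaDB.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names chosen for this sketch; use whatever fits your setup.
ENV_API_KEY = "LAMBDADB_PROJECT_API_KEY"
ENV_BASE_URL = "LAMBDADB_BASE_URL"
ENV_PROJECT_NAME = "LAMBDADB_PROJECT_NAME"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read client settings from the environment and fail fast on gaps."""
    env = os.environ if env is None else env
    # Keys mirror the keyword arguments passed to the client in Step 3.
    required = {
        "project_api_key": ENV_API_KEY,
        "base_url": ENV_BASE_URL,
        "project_name": ENV_PROJECT_NAME,
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in required.items()}
```

Because the returned keys match the keyword arguments used in Step 3, the result can be unpacked directly into the client constructor, for example LambdaDB(**load_settings()).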

🚀 Step 2: Install the SDK

The LambdaDB SDK provides convenient access to the LambdaDB APIs.
pip install lambdadb
For Python, we recommend using a virtual environment to keep your dependencies organized and avoid conflicts between projects.

📚 Step 3: Create a collection

A collection is where you’ll store your documents and define how they are indexed for search. LambdaDB supports nine index types: text, keyword, long, double, boolean, object, datetime, dense vector, and sparse vector. Let’s create a collection that combines text search with vector similarity:
from lambdadb import LambdaDB, models

# Initialize the LambdaDB client (replace with your base URL and project name, or omit to use defaults)
with LambdaDB(
    project_api_key="your_api_key_here",
    base_url="YOUR_BASE_URL",
    project_name="YOUR_PROJECT_NAME",
) as client:
    collection_name = "your_collection_name_here"
    res = client.collections.create(
        collection_name=collection_name,
        index_configs={
            "text": {
                "type": "text",
                "analyzers": ["english", "korean"],
            },
            "keyword": {"type": "keyword"},
            "vector": {
                "type": "vector",
                "dimensions": 10,
                "similarity": "cosine",
            },
        },
    )
    print(res)
Response:
CreateCollectionResponse(
    collection=CollectionResponse(
        project_name='your-project',
        collection_name='quickstart',
        index_configs={
            'text': IndexConfigsText(
                type=<TypeText.TEXT: 'text'>,
                analyzers=[<Analyzer.ENGLISH: 'english'>, <Analyzer.KOREAN: 'korean'>]
            ),
            'keyword': IndexConfigs(type=<Type.KEYWORD: 'keyword'>),
            'vector': IndexConfigsVector(
                type=<TypeVector.VECTOR: 'vector'>,
                dimensions=10,
                similarity=<Similarity.COSINE: 'cosine'>
            ),
            'id': IndexConfigs(type=<Type.KEYWORD: 'keyword'>)
        },
        num_docs=0,
        collection_status=<Status.CREATING: 'CREATING'>,
        source_project_name=None,
        source_collection_name=None,
        source_collection_version_id=None
    )
)
Key configuration details:
  • Text field: Supports multilingual search with English and Korean analyzers.
  • Vector field: 10-dimensional vectors using cosine similarity.
  • Keyword field: Supports exact-match filtering.
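To make the cosine setting concrete, here is a small pure-Python sketch of the textbook cosine similarity formula. It illustrates the metric only; how LambdaDB converts this value into a query score is not described here, so treat the function as an explanation aid rather than a reproduction of the service's scoring.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Vectors pointing in the same direction score close to 1 regardless of length,
# while perpendicular vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because the metric ignores vector length, only the direction of an embedding matters when a field uses cosine similarity.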

📄 Step 4: Add documents

Now let’s add some sample documents. Each document contains text for full-text search, a keyword value (a single string or a list of strings) for filtering, and a vector for similarity search:
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    docs = [
        {"id": "doc1", "text": "Serverless computing does not mean no servers are involved. It refers to a cloud computing model where the server management is abstracted away from developers.", "keyword": "serverless", "vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]},
        {"id": "doc2", "text": "Instead, it refers to a cloud computing model where developers can build and run applications without having to manage the underlying infrastructure.", "keyword": "cloud", "vector": [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]},
        {"id": "doc3", "text": "The key aspect is that developers don't need to explicitly provision or manage servers. The cloud provider handles all server management automatically.", "keyword": ["serverless", "infrastructure"], "vector": [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]},
    ]
    res = coll.docs.upsert(docs=docs)
    print(res)
Response:
MessageResponse(message='Upsert request is accepted')
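Since the upsert response only confirms that the request was accepted, it can help to catch malformed vectors on the client before sending. The sketch below is one way to do that, assuming the 10-dimensional vector field from Step 3. The find_vector_problems helper is hypothetical, and treating a missing vector as acceptable is an assumption, not documented LambdaDB behavior.

```python
from numbers import Real
from typing import Iterable, List, Mapping


def find_vector_problems(
    docs: Iterable[Mapping[str, object]],
    vector_field: str = "vector",
    dimensions: int = 10,
) -> List[str]:
    """Return readable problems; an empty list means every vector looks fine."""
    problems: List[str] = []
    for position, doc in enumerate(docs):
        label = str(doc.get("id", f"position {position}"))
        vector = doc.get(vector_field)
        if vector is None:
            # Assumption: documents without a vector are allowed, so skip them.
            continue
        if not isinstance(vector, (list, tuple)):
            problems.append(f"{label}: {vector_field} is not a list of numbers")
            continue
        if len(vector) != dimensions:
            problems.append(
                f"{label}: expected {dimensions} dimensions, got {len(vector)}"
            )
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(x, Real) and not isinstance(x, bool) for x in vector):
            problems.append(f"{label}: {vector_field} contains a non-numeric value")
    return problems
```

Calling find_vector_problems(docs) before coll.docs.upsert(docs=docs) and refusing to send when the list is non-empty keeps dimension mistakes from surfacing later at indexing or query time.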
Important notes:
  • Upsert behavior: Documents with the same ID will be replaced; new IDs create new documents.
  • Auto-generated IDs: If you don’t provide an ID, one will be generated automatically.
  • Bulk operations: For large-scale document ingestion (5MB+), use the bulk-upsert functionality.
  • Configurable consistency: LambdaDB is eventually consistent by default, so there can be a slight delay before new or changed documents are visible to queries. If your application requires strong (read-after-write) consistency, set consistentRead (or consistent_read in Python) to true when you query or fetch data from a collection.
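To decide between a regular upsert and bulk upsert, you can estimate the payload size on the client. The sketch below serializes the documents to JSON and compares the byte count against a 5 MB threshold. It assumes "5MB" means 5 × 1024 × 1024 bytes and that JSON size is a reasonable proxy for how the service measures a request; both are assumptions, and the helper names are hypothetical.

```python
import json
from typing import List, Mapping

# Assumption: "5MB" is interpreted as 5 MiB for this estimate.
BULK_THRESHOLD_BYTES = 5 * 1024 * 1024


def payload_size_bytes(docs: List[Mapping[str, object]]) -> int:
    """Approximate the request size as compact, UTF-8 encoded JSON."""
    encoded = json.dumps(docs, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def needs_bulk_upsert(
    docs: List[Mapping[str, object]],
    threshold: int = BULK_THRESHOLD_BYTES,
) -> bool:
    """True when the estimated payload reaches the bulk-upsert threshold."""
    return payload_size_bytes(docs) >= threshold
```

The three sample documents above are well under the threshold, so the regular upsert call is the right choice for this guide.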
Check indexing status: You can view collection stats to verify that your documents have been indexed:
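with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    res = client.collection(collection_name).get()  # or client.collections.get(collection_name=collection_name)
    print(res)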
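Because indexing is asynchronous, a script may want to wait until the documents are visible before querying. The helper below is a generic polling sketch: it calls a caller-supplied check function until it returns True or a timeout expires. The wait_until name and its parameters are hypothetical, and which field of the collection response to inspect (for example a document count) is left to the check function, since the exact response shape is not covered here.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True; return False if the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example with a stand-in check that succeeds on the third poll.
attempts = {"count": 0}


def fake_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until(fake_check, timeout_s=5.0, interval_s=0.0)
```

In a real script, the check function would fetch the collection and compare whatever indexing indicator the response exposes against the expected value.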

🔍 Step 5: Run a full-text search

Let’s search for documents that match “I hate managing servers” while keeping only documents tagged exactly with “serverless”. The second clause is marked "occur": "must", which makes the keyword match required:
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    res = coll.query(
        size=10,
        query={
            "bool": [
                {"queryString": {"query": "I hate managing servers", "defaultField": "text"}},
                {"queryString": {"query": "keyword:serverless"}, "occur": "must"},
            ]
        },
        consistent_read=True,
    )
    print("🔍 Search Results:")
    # Query responses expose items in `res.docs` (each item includes `doc` and `score`).
    # Use `res.documents` if you want document bodies only (no scores).
    for item in res.docs:
        doc_id = str(item.doc.get("id"))
        score = f"{item.score:.2f}"
        keyword = str(item.doc.get("keyword"))
        text = str(item.doc.get("text", ""))[:60]
        print(f"{doc_id:<5} | {score:<5} | {keyword:<15} | {text}...")
Response:
🔍 Search Results:
doc3  | 0.83  | ['serverless', 'infrastructure'] | The key aspect is that developers don't need to explicitly p...
doc1  | 0.80  | serverless      | Serverless computing does not mean no servers are involved. ...
Why these results? doc3 scored highest because it directly mentions “manage servers”, while doc1 matched on “server management” and “serverless computing”.

🔄 Step 6: Run a hybrid search

Now let’s combine full-text search with vector similarity for more comprehensive results. The query below pairs a queryString clause with a knn clause and fuses their scores using l2 normalization:
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    res = coll.query(
        size=10,
        query={
            "l2": [
                {"queryString": {"query": "I hate managing servers", "defaultField": "text"}},
                {"knn": {"field": "vector", "k": 5, "queryVector": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 1.0]}},
            ]
        },
        consistent_read=True,
    )
    print("🔄 Hybrid Search Results:")
    for item in res.docs:
        print(f"{item.doc.get('id')} | {item.score:.2f} | {item.doc.get('text', '')[:50]}...")
Response:
🔄 Hybrid Search Results:
doc3 | 0.66 | The key aspect is that developers don't need to ex...
doc1 | 0.62 | Serverless computing does not mean no servers are ...
doc2 | 0.33 | Instead, it refers to a cloud computing model wher...
Score normalization options:
  • rrf (Reciprocal Rank Fusion): Combines rankings from different search methods using each document's rank rather than its raw score
  • l2 (L2 Normalization): Normalizes scores using the L2 norm
  • mm (Min-Max Normalization): Simple linear scaling to the 0-1 range
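The three options can be sketched with their textbook definitions, as below. This is an illustration of the general techniques only: how LambdaDB applies them internally and combines the per-clause results is not specified here, the function names are hypothetical, and the constant k = 60 in the RRF sketch is a commonly used default rather than a documented LambdaDB setting.

```python
import math
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Linearly rescale scores so the lowest maps to 0 and the highest to 1."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; mapping them to 1.0 is a convention of this sketch.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def l2_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Divide each score by the L2 norm of the whole score vector."""
    norm = math.sqrt(sum(s * s for s in scores.values()))
    if norm == 0.0:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: s / norm for doc_id, s in scores.items()}


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranking (rank starts at 1)."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


# Made-up scores and rankings, for illustration only.
text_scores = {"doc3": 3.0, "doc1": 4.0}
l2_scores = l2_normalize(text_scores)
mm_scores = min_max_normalize(text_scores)
fused = rrf_fuse([["doc3", "doc1"], ["doc1", "doc3", "doc2"]])
```

Rank-based fusion ignores the magnitude of the original scores, which is why it is often preferred when the text and vector scores are on very different scales; the two normalization options keep relative score magnitudes instead.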

🧹 Step 7: Clean up

When you’re finished experimenting, clean up your resources:
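with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    client.collections.delete(collection_name=collection_name)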

🚀 Next steps

🤝 Support

Need help? Visit our Community Slack for support and discussions.