This guide will walk you through getting a project API key, installing the SDK, creating your first collection, and running hybrid search queries. You’ll learn to set up collections with text, keyword, and dense vector components, then execute both full-text and hybrid searches that combine traditional search with modern vector similarity.

🔑 Step 1: Get your API key

You’ll use a project API key from LambdaDB Cloud starting in Step 3.
LambdaDB Cloud is in public preview.
  1. Create a Cloud account — Open app.lambdadb.ai, sign up if you need an account, and sign in.
  2. Playground project (default) — New accounts include a Playground project so you can try the product immediately. It is a shared environment: the same project is visible to other users, and data you index there can be read by others. Use it only for feature testing—not for private, sensitive, or production workloads.
  3. Add a payment method — When you add a payment method, the Playground project is removed from your account and you can create your own projects. Project creation, billing, and the rest of the console capabilities are enabled once a payment method is on file. LambdaDB uses a serverless-native architecture: it is built from serverless components only (for example AWS Lambda and Amazon S3), so there is no monthly minimum fee—you pay for measured usage only. For why we built it this way, see the blog post “Serverless” Database Is Dead — It’s Time to Evolve. For how usage is metered and current rates, see Understanding costs.
  4. Create a project and save your key — When you create a project, choose an AWS region that fits latency and data-residency needs. After the project is created, your API key is shown only once—copy it immediately and store it somewhere safe (for example a password manager or a local environment variable). Use it with the base URL and project name from the console starting in Step 3.
Regional coverage: LambdaDB Cloud is available in all standard AWS commercial Regions (30+ Regions in the public preview). This is the same broad footprint AWS offers for most services, not a short list of Regions that expands over time. Pick any supported Region shown in the console when you create a project.
If you are still on Playground only (before adding a payment method), use the API key and connection details shown in the console for the Playground project—the same Step 3 code paths apply once you substitute those values.
The Cloud console also supports loading data and running queries in the GUI, alongside the SDK examples in this guide.
Keep your API key out of source control and prefer environment variables instead of hardcoding it in scripts.
The Playground project is shared with other users and is subject to rate limits. Do not put sensitive or production data in Playground.
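One way to follow the advice about keeping the key out of source control is to read the connection settings from environment variables. The sketch below is only an illustration: the variable names (`LAMBDADB_PROJECT_API_KEY`, `LAMBDADB_BASE_URL`, `LAMBDADB_PROJECT_NAME`) and the helper `load_settings` are hypothetical choices made for this example, not names defined by LambdaDB.

```python
import os

# Hypothetical variable names chosen for this example; LambdaDB does not define them.
REQUIRED_VARS = {
    "project_api_key": "LAMBDADB_PROJECT_API_KEY",
    "base_url": "LAMBDADB_BASE_URL",
    "project_name": "LAMBDADB_PROJECT_NAME",
}


def load_settings(env=None):
    """Return client keyword arguments read from environment variables.

    Raises RuntimeError naming every missing variable, so a misconfigured
    shell fails early instead of sending an empty key.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[name] for kwarg, name in REQUIRED_VARS.items()}
```

The returned dictionary uses the same keyword names as the client constructor shown in Step 3, so it could be passed as `LambdaDB(**load_settings())`.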

🚀 Step 2: Install the SDK

The LambdaDB SDK provides convenient access to the LambdaDB APIs.
pip install lambdadb
For Python, we recommend using a virtual environment to keep your dependencies organized and avoid conflicts between projects.
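To confirm that the install landed in the environment you expect, a small check like the following can help. It uses only the Python standard library; the helper names are illustrative, and the only assumption about LambdaDB is the distribution name `lambdadb` from the install command above.

```python
import sys
from importlib import metadata


def in_virtual_environment():
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package="lambdadb"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment:", in_virtual_environment())
    print("lambdadb version:", installed_version())
```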

📚 Step 3: Create a collection

A collection is where you’ll store your documents and define how they should be indexed for search. LambdaDB supports 9 different index types: text, keyword, long, double, boolean, object, datetime, dense vector, and sparse vector. Let’s create a collection that combines text search with vector similarity:
from lambdadb import LambdaDB, models

# Initialize the LambdaDB client (replace with your base URL and project name, or omit to use defaults)
with LambdaDB(
    project_api_key="your_api_key_here",
    base_url="YOUR_BASE_URL",
    project_name="YOUR_PROJECT_NAME",
) as client:
    collection_name = "your_collection_name_here"
    res = client.collections.create(
        collection_name=collection_name,
        index_configs={
            "text": {
                "type": "text",
                "analyzers": ["english", "korean"],
            },
            "keyword": {"type": "keyword"},
            "vector": {
                "type": "vector",
                "dimensions": 10,
                "similarity": "cosine",
            },
        },
    )
    print(res)
Response:
CreateCollectionResponse(
    collection=CollectionResponse(
        project_name='playground',
        collection_name='quickstart',
        index_configs={
            'text': IndexConfigsText(
                type=<TypeText.TEXT: 'text'>,
                analyzers=[<Analyzer.ENGLISH: 'english'>, <Analyzer.KOREAN: 'korean'>]
            ),
            'keyword': IndexConfigs(type=<Type.KEYWORD: 'keyword'>),
            'vector': IndexConfigsVector(
                type=<TypeVector.VECTOR: 'vector'>,
                dimensions=10,
                similarity=<Similarity.COSINE: 'cosine'>
            ),
            'id': IndexConfigs(type=<Type.KEYWORD: 'keyword'>)
        },
        num_docs=0,
        collection_status=<Status.CREATING: 'CREATING'>,
        source_project_name=None,
        source_collection_name=None,
        source_collection_version_id=None
    )
)
Key configuration details:
  • Text field: Supports multilingual search with English and Korean analyzers
  • Vector field: 10-dimensional vectors using cosine similarity
  • Keyword field: Supports exact-match filtering
  • id field: Added automatically as a keyword index, as the response above shows
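Because the vector index is declared with 10 dimensions and cosine similarity, it can be useful to check documents on the client before sending them. The following is a minimal sketch of such a check. The helper `vector_problems` is hypothetical, and this guide does not say how LambdaDB itself treats missing, mis-sized, or zero vectors, so the rules here are assumptions for illustration.

```python
import math


def vector_problems(doc, field="vector", dimensions=10):
    """Return a list of human-readable problems with doc[field]; empty if none."""
    vector = doc.get(field)
    if vector is None:
        # Assumption for this sketch: every document should carry a vector.
        return [f"missing field {field!r}"]
    if not isinstance(vector, (list, tuple)):
        return [f"{field!r} must be a list of numbers"]
    problems = []
    if len(vector) != dimensions:
        problems.append(f"expected {dimensions} dimensions, got {len(vector)}")
    numeric = all(
        isinstance(x, (int, float)) and not isinstance(x, bool) and math.isfinite(x)
        for x in vector
    )
    if not numeric:
        problems.append("all components must be finite numbers")
    elif all(x == 0 for x in vector):
        # Cosine similarity divides by the vector norm, so a zero vector has no direction.
        problems.append("zero vector has no defined cosine similarity")
    return problems
```

For example, `vector_problems({"vector": [0.1] * 9})` reports a dimension mismatch, while a 10-component vector of finite, non-zero values returns an empty list.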

📄 Step 4: Add documents

Now let’s add some sample documents. Each document contains text for full-text search, keywords for filtering, and vectors for similarity search:
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    docs = [
        {"id": "doc1", "text": "Serverless computing does not mean no servers are involved. It refers to a cloud computing model where the server management is abstracted away from developers.", "keyword": "serverless", "vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]},
        {"id": "doc2", "text": "Instead, it refers to a cloud computing model where developers can build and run applications without having to manage the underlying infrastructure.", "keyword": "cloud", "vector": [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]},
        {"id": "doc3", "text": "The key aspect is that developers don't need to explicitly provision or manage servers. The cloud provider handles all server management automatically.", "keyword": ["serverless", "infrastructure"], "vector": [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]},
    ]
    res = coll.docs.upsert(docs=docs)
    print(res)
Response:
MessageResponse(message='Upsert request is accepted')
Important notes:
  • Upsert behavior: Documents with the same ID will be replaced; new IDs create new documents.
  • Auto-generated IDs: If you don’t provide an ID, one will be generated automatically.
  • Bulk operations: For large-scale document ingestion (5MB+), use the bulk-upsert functionality.
  • Configurable consistency: LambdaDB is eventually consistent by default, so there can be a slight delay before new or changed documents are visible to queries. If your application requires strong (read-after-write) consistency, set consistentRead (or consistent_read in Python) to true when you query or fetch data from a collection.
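To decide between a regular upsert and the bulk path, you can estimate the size of a batch before sending it. The sketch below measures the batch as compact UTF-8 JSON. It assumes that "5MB" means 5 * 1024 * 1024 bytes and that the limit applies to the serialized documents; neither detail is stated in this guide, and the helper names are hypothetical.

```python
import json

# Assumption: the 5 MB threshold is read as 5 * 1024 * 1024 bytes of JSON.
BULK_THRESHOLD_BYTES = 5 * 1024 * 1024


def payload_size_bytes(docs):
    """Size in bytes of the documents serialized as compact UTF-8 JSON."""
    text = json.dumps(docs, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def should_use_bulk(docs, threshold=BULK_THRESHOLD_BYTES):
    """True when the serialized batch reaches the bulk-upsert threshold."""
    return payload_size_bytes(docs) >= threshold
```

The three sample documents in this step serialize to well under the threshold, so the regular upsert call is the natural choice for them.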
Check indexing status: You can view collection stats to verify that your documents have been indexed:
# Run this inside a `with LambdaDB(...) as client:` block, as in the examples above.
res = client.collection(collection_name).get()  # or client.collections.get(collection_name=collection_name)
print(res)
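Since upserts are accepted asynchronously and reads are eventually consistent by default, a script may want to wait until the documents are visible before querying. The helper below is a generic polling sketch, not part of the LambdaDB SDK: it repeatedly calls a function you supply and stops when a condition you supply holds.

```python
import time


def wait_until(fetch, is_ready, timeout=30.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_ready(result) is true or timeout expires.

    Returns the last fetched result on success and raises TimeoutError otherwise.
    The sleep and clock parameters exist so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)
```

With the client from this guide, a call might look like `wait_until(lambda: client.collection(collection_name).get(), lambda res: res.collection.num_docs >= 3)`. That usage assumes the stats response exposes `collection.num_docs` in the same shape as the create response shown in Step 3, which this guide does not confirm.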

🔍 Step 5: Full-text search

Let’s search for documents that match “I hate managing servers” while keeping only documents tagged exactly with “serverless”. The bool query below combines a scored text clause with a required keyword clause (occur set to must):
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    res = coll.query(
        size=10,
        query={
            "bool": [
                {"queryString": {"query": "I hate managing servers", "defaultField": "text"}},
                {"queryString": {"query": "keyword:serverless"}, "occur": "must"},
            ]
        },
        consistent_read=True,
    )
    print("🔍 Search Results:")
    # Query responses expose items in `res.docs` (each item includes `doc` and `score`).
    # Use `res.documents` if you want document bodies only (no scores).
    for item in res.docs:
        doc_id = str(item.doc.get("id"))
        score = f"{item.score:.2f}"
        keyword = str(item.doc.get("keyword"))
        text = str(item.doc.get("text", ""))[:60]
        print(f"{doc_id:<5} | {score:<5} | {keyword:<15} | {text}...")
Response:
🔍 Search Results:
doc3  | 0.83  | ['serverless', 'infrastructure'] | The key aspect is that developers don't need to explicitly p...
doc1  | 0.80  | serverless      | Serverless computing does not mean no servers are involved. ...
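Why these results? doc3 scored highest because it directly mentions “manage servers”, while doc1 matched on “server management” and “serverless computing”. doc2 is absent because its keyword is “cloud”, so it fails the required keyword:serverless clause.

🔄 Step 6: Hybrid search

Now let’s combine full-text search with vector similarity for more comprehensive results. The top-level l2 key names the score normalization applied when the text and kNN results are combined; the available options are listed after the results: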
with LambdaDB(project_api_key="your_api_key_here", base_url="YOUR_BASE_URL", project_name="YOUR_PROJECT_NAME") as client:
    coll = client.collection(collection_name)
    res = coll.query(
        size=10,
        query={
            "l2": [
                {"queryString": {"query": "I hate managing servers", "defaultField": "text"}},
                {"knn": {"field": "vector", "k": 5, "queryVector": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 1.0]}},
            ]
        },
        consistent_read=True,
    )
    print("🔄 Hybrid Search Results:")
    for item in res.docs:
        print(f"{item.doc.get('id')} | {item.score:.2f} | {item.doc.get('text', '')[:50]}...")
Response:
🔄 Hybrid Search Results:
doc3 | 0.66 | The key aspect is that developers don't need to ex...
doc1 | 0.62 | Serverless computing does not mean no servers are ...
doc2 | 0.33 | Instead, it refers to a cloud computing model wher...
Score normalization options:
  • rrf (Reciprocal Rank Fusion): Great for combining rankings from different search methods
  • l2 (L2 Normalization): Normalizes scores using L2 norm
  • mm (Min-Max Normalization): Simple linear scaling to 0-1 range
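To make the three options concrete, here is a sketch of the textbook definitions in plain Python. These are the standard formulas for each technique, not LambdaDB's implementation: the exact formulas LambdaDB applies, how it handles edge cases, and the RRF constant (k = 60 is a commonly used default, assumed here) are not specified in this guide.

```python
import math


def min_max_normalize(scores):
    """Linearly rescale scores to the range 0..1 (all-equal input maps to 1.0)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def l2_normalize(scores):
    """Divide each score by the Euclidean (L2) norm of the score list."""
    norm = math.sqrt(sum(s * s for s in scores))
    if norm == 0:
        return [0.0 for _ in scores]
    return [s / norm for s in scores]


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over each ranked list of IDs."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Min-max and L2 normalization operate on raw scores, so they preserve how far apart results are within each list. RRF uses only rank positions, which makes it robust when the text and vector scores are on very different scales.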

🧹 Step 7: Clean up

When you’re finished experimenting, clean up your resources:
# Run this inside a `with LambdaDB(...) as client:` block, as in the examples above.
client.collections.delete(collection_name=collection_name)

🚀 Next steps

🤝 Support

Need help? Visit our Community Slack for support and discussions.