Astra DB
DataStax Astra DB is a serverless, AI-ready database built on Apache Cassandra® and made conveniently available through an easy-to-use JSON API.
See a tutorial provided by DataStax.
Installation and Setup
Install the following Python package:
pip install "langchain-astradb>=0.6,<0.7"
Create a database (if needed) and get the connection secrets. Set the following variables:
ASTRA_DB_API_ENDPOINT="API_ENDPOINT"
ASTRA_DB_APPLICATION_TOKEN="TOKEN"
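The snippets on this page use these two names as plain Python variables. Rather than hard-coding secrets, one option is to read them from the process environment. The following is a minimal sketch under that assumption; `read_astra_settings` is a hypothetical helper and not part of `langchain-astradb`.

```python
import os


def read_astra_settings(env=None):
    """Return (api_endpoint, token) from an environment-style mapping.

    Illustrative helper only: it is not provided by langchain-astradb.
    """
    env = os.environ if env is None else env
    names = ("ASTRA_DB_API_ENDPOINT", "ASTRA_DB_APPLICATION_TOKEN")
    # Collect every variable that is absent or empty so the error is complete.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Astra DB settings: " + ", ".join(missing))
    return env["ASTRA_DB_API_ENDPOINT"], env["ASTRA_DB_APPLICATION_TOKEN"]
```

Calling `ASTRA_DB_API_ENDPOINT, ASTRA_DB_APPLICATION_TOKEN = read_astra_settings()` once at startup would then define both variables, and fail early if either one is unset.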
Vector Store
A few typical initialization patterns are shown here:
from langchain_astradb import AstraDBVectorStore

# `my_embedding` is a LangChain embeddings object created beforehand.
vector_store = AstraDBVectorStore(
    embedding=my_embedding,
    collection_name="my_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)

# Server-side embedding computation ("vectorize"): no `embedding` argument is passed.
from astrapy.info import VectorServiceOptions

vector_store_vectorize = AstraDBVectorStore(
    collection_name="my_vectorize_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
    collection_vector_service_options=VectorServiceOptions(
        provider="nvidia",
        model_name="NV-Embed-QA",
    ),
)

# Hybrid search: server-side embeddings plus a lexical (BM25) index and a reranker.
from astrapy.info import (
    CollectionLexicalOptions,
    CollectionRerankOptions,
    RerankServiceOptions,
    VectorServiceOptions,
)

vector_store_hybrid = AstraDBVectorStore(
    collection_name="my_hybrid_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
    collection_vector_service_options=VectorServiceOptions(
        provider="nvidia",
        model_name="NV-Embed-QA",
    ),
    collection_lexical=CollectionLexicalOptions(analyzer="standard"),
    collection_rerank=CollectionRerankOptions(
        service=RerankServiceOptions(
            provider="nvidia",
            model_name="nvidia/llama-3.2-nv-rerankqa-1b-v2",
        ),
    ),
)
Notable features of class AstraDBVectorStore:
- native async API;
- metadata filtering in search;
- MMR (maximal marginal relevance) search;
- server-side embedding computation ("vectorize" in Astra DB parlance);
- automatic detection of its settings from an existing, pre-populated Astra DB collection;
- hybrid search (vector + BM25, followed by a rerank step);
- support for non-Astra Data API deployments (e.g. self-hosted HCD).
Learn more in the example notebook.
See the example provided by DataStax.
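To make the MMR item in the feature list more concrete, here is a small pure-Python sketch of the general selection technique: each pick trades relevance to the query against similarity to results already chosen. It illustrates the idea only and is not the code `AstraDBVectorStore` runs; the names `cosine`, `mmr_select`, and `lambda_mult` as used here are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidate_vecs, k=2, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by marginal relevance.

    lambda_mult=1.0 ranks by relevance alone; lower values favor diversity.
    """
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, candidate_vecs[i])
            # Penalize closeness to the most similar already-selected item.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two near-duplicate candidates and one distinct candidate, a balanced `lambda_mult` tends to skip the second duplicate in favor of the distinct one, whereas `lambda_mult=1.0` returns the two closest matches.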
Chat message history
from langchain_astradb import AstraDBChatMessageHistory

message_history = AstraDBChatMessageHistory(
    session_id="test-session",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
See the usage example.
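Once created, the history object can be used through the standard LangChain chat-history interface. The sketch below assumes only the `add_user_message`, `add_ai_message`, and `messages` members of that interface; `record_exchange` is a hypothetical helper name.

```python
def record_exchange(history, user_text, ai_text):
    """Append one user/AI turn to a chat history and return all stored messages."""
    history.add_user_message(user_text)
    history.add_ai_message(ai_text)
    # `messages` reflects everything stored so far for this session.
    return list(history.messages)
```

For example, `record_exchange(message_history, "Hi!", "Hello, how can I help?")` would store one turn under the session ID given at construction time.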
LLM Cache
from langchain.globals import set_llm_cache
from langchain_astradb import AstraDBCache

set_llm_cache(AstraDBCache(
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
))
Learn more in the example notebook (scroll to the Astra DB section).
Semantic LLM Cache
from langchain.globals import set_llm_cache
from langchain_astradb import AstraDBSemanticCache

set_llm_cache(AstraDBSemanticCache(
    embedding=my_embedding,
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
))
Learn more in the example notebook (scroll to the appropriate section).
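As a rough illustration of what "semantic" means here, the following in-memory sketch returns a cached response when a new prompt's embedding is close enough to a stored one, instead of requiring an exact string match. It is a conceptual toy, not how `AstraDBSemanticCache` is implemented; the class name `TinySemanticCache`, its methods, and the `threshold` parameter are all assumptions made for the example.

```python
import math


class TinySemanticCache:
    """In-memory illustration of a semantic cache (hypothetical, not a library class)."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: str -> list of floats
        self.threshold = threshold  # minimum cosine similarity for a cache hit
        self.entries = []           # list of (vector, cached_response) pairs

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def update(self, prompt, response):
        """Store a response under the embedding of its prompt."""
        self.entries.append((self.embed(prompt), response))

    def lookup(self, prompt):
        """Return the response of the most similar stored prompt, or None on a miss."""
        vec = self.embed(prompt)
        best = max(self.entries, key=lambda e: self._cosine(vec, e[0]), default=None)
        if best is not None and self._cosine(vec, best[0]) >= self.threshold:
            return best[1]
        return None
```

The threshold controls the trade-off: a higher value gives fewer but safer hits, while a lower value reuses answers for more loosely related prompts.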
Document loader
from langchain_astradb import AstraDBLoader

loader = AstraDBLoader(
    collection_name="my_collection",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
Learn more in the example notebook.
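Once constructed, the loader can be consumed like any other LangChain document loader. The sketch below assumes only the standard `lazy_load()` method and the `page_content` attribute of the documents it yields; `preview_documents` is a hypothetical helper name.

```python
from itertools import islice


def preview_documents(loader, limit=3, width=80):
    """Return the first `width` characters of up to `limit` loaded documents."""
    # lazy_load() yields documents one at a time, so only `limit` are fetched here.
    return [doc.page_content[:width] for doc in islice(loader.lazy_load(), limit)]
```

Calling `preview_documents(loader)` is a quick way to check what the collection contains before loading it in full.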
Self-querying retriever
from langchain_astradb import AstraDBVectorStore
from langchain.retrievers.self_query.base import SelfQueryRetriever

vector_store = AstraDBVectorStore(
    embedding=my_embedding,
    collection_name="my_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)

# `my_llm`, `document_content_description` and `metadata_field_info`
# are defined beforehand by the caller.
retriever = SelfQueryRetriever.from_llm(
    my_llm,
    vector_store,
    document_content_description,
    metadata_field_info,
)
Learn more in the example notebook.
Store
from langchain_astradb import AstraDBStore

store = AstraDBStore(
    collection_name="my_kv_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
See the API reference for AstraDBStore.
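The store follows LangChain's generic key-value store interface. The sketch below assumes only that interface's `mset`, `mget`, `mdelete`, and `yield_keys` methods, and that the values are JSON-serializable; `store_roundtrip` and the example keys are hypothetical.

```python
def store_roundtrip(store):
    """Write two entries, read them back along with a missing key, then delete one."""
    store.mset([("user:1", {"name": "Ada"}), ("user:2", {"name": "Linus"})])
    # mget returns one result per requested key; a missing key yields None.
    found = store.mget(["user:1", "user:2", "user:3"])
    store.mdelete(["user:2"])
    # List the keys that are still present under the "user:" prefix.
    remaining = sorted(store.yield_keys(prefix="user:"))
    return found, remaining
```

Note that running this against a real collection writes and deletes entries, so a scratch collection is the safer place to try it.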
Byte Store
from langchain_astradb import AstraDBByteStore

store = AstraDBByteStore(
    collection_name="my_kv_store",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
See the API reference for AstraDBByteStore.
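Because a byte store holds raw `bytes` values, structured data has to be encoded before writing and decoded after reading. The sketch below shows one way to do this with JSON, assuming only the generic `mset` and `mget` methods of the store interface; `put_json` and `get_json` are hypothetical helper names.

```python
import json


def put_json(byte_store, key, obj):
    """Serialize `obj` to UTF-8 encoded JSON bytes and store it under `key`."""
    byte_store.mset([(key, json.dumps(obj).encode("utf-8"))])


def get_json(byte_store, key):
    """Fetch `key` and decode it from JSON, or return None if the key is absent."""
    (raw,) = byte_store.mget([key])
    return None if raw is None else json.loads(raw.decode("utf-8"))
```

Any other bytes-producing serializer could be substituted for JSON, as long as the same one is used for both writing and reading.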