Utilizing LangChain with Qdrant for Advanced Vector Searching

Name: Lynn Mikami

Published on 4/30/2024

Introducing Qdrant: A Vector Similarity Search Engine for LangChain

In the world of language modeling and text-based applications, the ability to efficiently search and retrieve similar vectors is crucial. Whether it's finding relevant documents, matching user queries, or performing semantic-based matching, having a powerful and convenient vector similarity search engine can greatly enhance the capabilities of applications. That's where Qdrant comes in.

Qdrant is a vector similarity search engine that provides a production-ready service with a convenient API for storing, searching, and managing vectors with additional payload. It is designed to handle large-scale vector searches efficiently and offers extensive filtering capabilities for advanced matching needs. Qdrant is particularly useful for applications that require neural network or semantic-based matching, faceted search, and more.

Article Summary

Qdrant is a vector similarity search engine that provides a production-ready service with a convenient API for storing, searching, and managing vectors with additional payload.
It offers extensive filtering capabilities and is suitable for applications that require advanced matching needs.
Qdrant can be used as a retriever in LangChain for cosine similarity searches or Maximal Marginal Relevance (MMR) searches.

Installing Qdrant Client and Setting Up OpenAI API Key

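To use Qdrant in LangChain, first install the qdrant-client package, which provides the client library for talking to Qdrant. The examples in this article also import from langchain-community, langchain-openai, and langchain-text-splitters, so install those alongside it. In a Jupyter notebook, you can use the following command (drop the leading % in a regular shell):

%pip install --upgrade --quiet qdrant-client langchain-community langchain-openai langchain-text-splitters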

Next, set up the OpenAI API key. It is required because the examples use OpenAI embeddings to turn documents and queries into vectors. You can read the key with the getpass module and store it in an environment variable with the os module:

import getpass
import os
 
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

Loading and Splitting Documents in LangChain

Once the Qdrant client is installed and the OpenAI API Key is set up, you can start loading and splitting your documents in LangChain. To do this, you can use the TextLoader class provided by the LangChain community module to load your documents.

from langchain_community.document_loaders import TextLoader
 
loader = TextLoader("/path/to/your/documents.txt")
documents = loader.load()

After loading the documents, you can split them into smaller chunks using the CharacterTextSplitter class provided by the LangChain text splitters module. This can be useful when dealing with large documents or when you want to perform searches on specific parts of the text.

from langchain_text_splitters import CharacterTextSplitter
 
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
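
To make chunk_size and chunk_overlap concrete, here is a minimal sketch of fixed-width chunking in plain Python. It is a simplified illustration, not LangChain's implementation: CharacterTextSplitter first splits on a separator and then merges the pieces up to chunk_size. The function name split_fixed is hypothetical.

```python
def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


chunks = split_fixed("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

With an overlap of 1, each chunk repeats the last character of the previous one, which helps keep context that would otherwise be cut at a chunk boundary.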

Connecting to Qdrant in Different Modes

Once your documents are loaded and split, you can connect to Qdrant in different modes depending on your deployment needs. Qdrant supports multiple deployment options, including local mode, on-premise server deployment, and Qdrant Cloud.

Local Mode with In-Memory Storage

In local mode, you can run Qdrant without a server and keep the data in memory only. This mode is useful for quick experimentation and testing. To connect to Qdrant in local mode with in-memory storage, you can use the Qdrant.from_documents method and specify the location parameter as :memory:.

from langchain_community.vectorstores import Qdrant
from langchain_openai import OpenAIEmbeddings
 
embeddings = OpenAIEmbeddings()
 
qdrant = Qdrant.from_documents(docs, embeddings, location=":memory:", collection_name="my_documents")

Local Mode with Disk Storage

If you want to persist the data on disk in local mode, you can specify a path where the Qdrant data will be stored. This can be useful when dealing with larger datasets or when you need to preserve the data across sessions. To connect to Qdrant in local mode with disk storage, you can use the Qdrant.from_documents method and specify the path parameter.

qdrant = Qdrant.from_documents(docs, embeddings, path="/tmp/local_qdrant", collection_name="my_documents")

On-Premise Server Deployment

For larger-scale deployments, you can connect to a Qdrant instance running locally with a Docker container or a Kubernetes deployment. To connect to Qdrant in on-premise server deployment, you need to specify the URL of the Qdrant instance and set the prefer_grpc parameter to True for better performance.

url = "<---qdrant url here --->"
qdrant = Qdrant.from_documents(docs, embeddings, url=url, prefer_grpc=True, collection_name="my_documents")

Qdrant Cloud

If you prefer a fully-managed Qdrant cluster, you can set up a Qdrant Cloud account and connect to the cluster using the provided URL and API key. This option offers a scalable and secure solution for deploying Qdrant. To connect to Qdrant in Qdrant Cloud, you need to specify the URL and API key in the Qdrant.from_documents method.

url = "<---qdrant cloud cluster url here --->"
api_key = "<---api key here--->"
qdrant = Qdrant.from_documents(docs, embeddings, url=url, prefer_grpc=True, api_key=api_key, collection_name="my_documents")

The sections below revisit these four connection modes with an open-source embedding model, then show how to perform similarity searches on the Qdrant collection and retrieve similarity scores for the search results.

Connecting to Qdrant in Different Modes

The same connection modes work with any LangChain embedding model. The examples below repeat them with an open-source Hugging Face sentence-transformers model in place of OpenAI embeddings, so no API key is needed to compute the vectors.

Local Mode with In-Memory Storage

In local mode, you can run Qdrant without a Qdrant server. This is useful for testing and debugging, or when you only need to store a small number of vectors. In this mode, the embeddings are kept fully in memory and are lost when the client is destroyed.

To connect to Qdrant in local mode with in-memory storage, you can use the following code:

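from langchain_community.vectorstores import Qdrant
from langchain_community.embeddings import HuggingFaceEmbeddings
 
# HuggingFaceEmbeddings requires the sentence-transformers package
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
# docs are the split chunks produced by the text splitter earlier
qdrant = Qdrant.from_documents(
    docs, embeddings, location=":memory:", collection_name="my_collection"
)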

Local Mode with Disk Storage

If you prefer to persist the vectors on disk in local mode, you can specify a path where the vectors will be stored. This allows you to reuse the vectors between runs and avoid starting from scratch each time.

To connect to Qdrant in local mode with disk storage, you can use the following code:

from langchain_community.vectorstores import Qdrant
from langchain_community.embeddings import HuggingFaceEmbeddings
 
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
# docs are the split chunks produced by the text splitter earlier
qdrant = Qdrant.from_documents(
    docs, embeddings, path="/path/to/storage", collection_name="my_collection"
)

On-Premise Server Deployment

If you choose to deploy Qdrant on-premise, either using a Docker container or a Kubernetes deployment with the official Helm chart, you will need to provide the URL of the Qdrant service.

To connect to Qdrant in on-premise server deployment, you can use the following code:

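import qdrant_client
from langchain_community.vectorstores import Qdrant
from langchain_community.embeddings import HuggingFaceEmbeddings
 
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
# Connect to a running server; api_key is only needed if the server enforces one
client = qdrant_client.QdrantClient(url="<qdrant-url>", api_key="<qdrant-api-key>")
# Wrap an existing collection; unlike from_documents, this does not index any documents
qdrant = Qdrant(client=client, collection_name="my_collection", embeddings=embeddings)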

Qdrant Cloud

If you prefer to use Qdrant Cloud, you can connect to it by providing the appropriate URL and API key.

To connect to Qdrant in Qdrant Cloud, you can use the following code:

from langchain_community.vectorstores import Qdrant
from langchain_community.embeddings import HuggingFaceEmbeddings
 
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
# docs are the split chunks produced by the text splitter earlier
qdrant = Qdrant.from_documents(
    docs, embeddings, url="<qdrant-url>", api_key="<qdrant-api-key>", collection_name="my_collection"
)

By using these different modes of connection, you can easily integrate Qdrant into your LangChain applications and take advantage of its powerful vector similarity search capabilities.
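
Since the four modes differ only in the keyword arguments passed to Qdrant.from_documents, one way to keep the choice configurable is a small helper that maps a mode name to those arguments. This is a sketch under stated assumptions: the helper connection_kwargs and its mode names are hypothetical, and only the parameter names (location, path, url, api_key, prefer_grpc) come from the examples in this article.

```python
from typing import Optional


def connection_kwargs(
    mode: str,
    path: Optional[str] = None,
    url: Optional[str] = None,
    api_key: Optional[str] = None,
) -> dict:
    """Return the keyword arguments for one of the four connection modes."""
    if mode == "memory":
        return {"location": ":memory:"}
    if mode == "disk":
        if not path:
            raise ValueError("disk mode needs a path")
        return {"path": path}
    if mode == "server":
        if not url:
            raise ValueError("server mode needs a url")
        return {"url": url, "prefer_grpc": True}
    if mode == "cloud":
        if not url or not api_key:
            raise ValueError("cloud mode needs a url and an api_key")
        return {"url": url, "api_key": api_key, "prefer_grpc": True}
    raise ValueError(f"unknown mode: {mode}")


kwargs = connection_kwargs("disk", path="/tmp/local_qdrant")
print(kwargs)
```

Assuming docs and embeddings exist as in the earlier examples, the result could then be passed on as Qdrant.from_documents(docs, embeddings, collection_name="my_documents", **kwargs).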

Performing Similarity Searches on Qdrant Collection

Once you have connected to a Qdrant collection, you can perform similarity searches on the vectors stored in it. This lets you find the documents whose vectors are closest to a given query.

In LangChain, you pass the query as text and specify the number of nearest neighbors to retrieve. The vector store embeds the query with the same embedding model used for indexing, and Qdrant returns the nearest neighbors. If you already have a query embedding, similarity_search_by_vector accepts it directly; it must have the same dimensionality as the stored vectors.

Here is an example of how to perform a similarity search using Qdrant:

# Query text; the vector store embeds it before searching
query = "your query text"
 
# Number of nearest neighbors to retrieve
k = 5
 
# Perform similarity search and return (Document, score) pairs
results = qdrant.similarity_search_with_score(query, k=k)
 
for doc, score in results:
    print(f"{score:.3f}  {doc.page_content[:80]}")

The results object is a list of (Document, score) tuples. With cosine distance, a higher score means a closer match. Use similarity_search instead if you only need the documents without scores.
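
To illustrate what a similarity score measures, here is a minimal brute-force sketch of cosine-similarity ranking in plain Python. It is not how Qdrant works internally (Qdrant uses an index rather than comparing against every vector), and the names top_k and the toy two-dimensional vectors are assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector: list[float], vectors: dict, k: int) -> list:
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query_vector, v)) for name, v in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


vectors = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
results = top_k([1.0, 0.1], vectors, k=2)
for name, score in results:
    print(name, round(score, 3))
```

Here doc_a ranks first because it points in almost the same direction as the query, and doc_c ranks second because it shares only part of that direction.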

Utilizing Qdrant's Extensive Filtering Capabilities

Qdrant provides extensive filtering capabilities that let you refine search results by conditions on the payload stored with each vector. Filters apply to payload fields, not to the vector values themselves. The LangChain integration stores each document's metadata under the metadata payload key, so a metadata field named attribute_name is addressed as metadata.attribute_name.

Here are some examples of how to use Qdrant's filtering capabilities in LangChain:

Filter by an Exact Payload Value

from qdrant_client.http import models
 
# Keep only documents whose metadata field equals a specific value
exact_filter = models.Filter(
    must=[models.FieldCondition(key="metadata.attribute_name", match=models.MatchValue(value="attribute_value"))]
)
results = qdrant.similarity_search("your query text", k=5, filter=exact_filter)

Filter by a Payload Range

# Keep only documents whose numeric metadata field lies between 10 and 20
range_filter = models.Filter(
    must=[models.FieldCondition(key="metadata.attribute_name", range=models.Range(gte=10, lte=20))]
)
results = qdrant.similarity_search("your query text", k=5, filter=range_filter)

These filtering capabilities allow you to narrow down your search results and retrieve only the documents that meet your specific criteria.
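
As a rough mental model of exact-match and range conditions, the sketch below applies a list of conditions to payload dictionaries in plain Python. The condition format and the function matches are hypothetical simplifications made for this example, not Qdrant's actual filter schema; the point is only that every condition must hold for a point to be kept.

```python
def matches(payload: dict, must: list) -> bool:
    """Return True when the payload satisfies every condition in must."""
    for cond in must:
        value = payload.get(cond["key"])
        # Exact-match condition: the field must equal the given value
        if "match" in cond and value != cond["match"]:
            return False
        # Range condition: the field must be numeric and within the bounds
        if "range" in cond:
            if not isinstance(value, (int, float)):
                return False
            bounds = cond["range"]
            if "gte" in bounds and value < bounds["gte"]:
                return False
            if "lte" in bounds and value > bounds["lte"]:
                return False
    return True


points = [
    {"id": 1, "payload": {"category": "news", "year": 2015}},
    {"id": 2, "payload": {"category": "blog", "year": 2015}},
    {"id": 3, "payload": {"category": "news", "year": 2005}},
    {"id": 4, "payload": {"category": "news", "year": 2020}},
]
must = [
    {"key": "category", "match": "news"},
    {"key": "year", "range": {"gte": 2010, "lte": 2020}},
]
kept = [p["id"] for p in points if matches(p["payload"], must)]
print(kept)
```

Only the points that pass the filter would then be candidates for the nearest-neighbor ranking.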

Retrieving Diverse Results with Maximal Marginal Relevance (MMR) Search

In addition to performing similarity searches, Qdrant also provides the ability to retrieve diverse results using the Maximal Marginal Relevance (MMR) search method. MMR search aims to find a set of documents that are both relevant and diverse, providing a more comprehensive representation of the search space.

To perform an MMR search in LangChain, you provide the query text, the number of documents to return (k), the number of candidates to fetch before re-ranking (fetch_k), and a diversity parameter (lambda_mult) that controls the trade-off between relevance and diversity.

Here is an example of how to perform an MMR search using Qdrant:

# Query text
query = "your query text"
 
# Number of documents to return
k = 5
 
# Diversity parameter: 1 favors relevance, 0 favors diversity
lambda_mult = 0.5
 
# Perform MMR search: fetch 20 candidates, then select k of them
results = qdrant.max_marginal_relevance_search(query, k=k, fetch_k=20, lambda_mult=lambda_mult)

The results object is a list of documents chosen to be both relevant to the query and different from each other. By adjusting lambda_mult, you can control the balance between relevance and diversity in the search results.
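
The trade-off can be seen in a minimal sketch of the greedy MMR selection rule in plain Python. This is an illustration of the general technique under stated assumptions, not the code LangChain or Qdrant runs: the function mmr, its parameter lambda_mult, and the toy vectors are chosen for the example. Each step picks the candidate with the best relevance to the query minus its similarity to what was already picked.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query: list[float], candidates: list, k: int, lambda_mult: float = 0.5) -> list[int]:
    """Greedy MMR: return the indices of up to k selected candidates."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))

    def score(i: int) -> float:
        relevance = cosine(query, candidates[i])
        # Redundancy is the highest similarity to anything already selected
        redundancy = max((cosine(candidates[i], candidates[j]) for j in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.1], [1.0, 0.12], [0.6, -0.8]]  # the first two are near-duplicates

balanced = mmr(query, candidates, k=2, lambda_mult=0.5)
relevance_only = mmr(query, candidates, k=2, lambda_mult=1.0)
print(balanced, relevance_only)
```

With lambda_mult set to 1.0 the two near-duplicates are returned, while 0.5 swaps the second one for the less similar third candidate.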

Using Qdrant as a Retriever in LangChain

Qdrant can be used as a retriever in LangChain for both cosine similarity searches and MMR searches. By integrating Qdrant into your LangChain applications, you can leverage its powerful vector similarity search capabilities to enhance the retrieval performance and accuracy.

To use Qdrant as a retriever for cosine similarity searches, call as_retriever on the vector store. Similarity search is the default search type:

# Wrap the vector store as a retriever that returns the 5 nearest documents
retriever = qdrant.as_retriever(search_kwargs={"k": 5})
results = retriever.invoke("your query text")

To use Qdrant as a retriever for MMR searches, set search_type to "mmr":

# Wrap the vector store as a retriever that re-ranks candidates with MMR
retriever = qdrant.as_retriever(search_type="mmr", search_kwargs={"k": 5, "lambda_mult": 0.5})
results = retriever.invoke("your query text")

By utilizing Qdrant as a retriever in LangChain, you can easily incorporate vector similarity search functionality into your language model-based applications.

Conclusion

In this article, we have explored how to connect to Qdrant in different modes, perform similarity searches on Qdrant collections, utilize Qdrant's extensive filtering capabilities, retrieve diverse results using MMR search, and use Qdrant as a retriever in LangChain. By integrating Qdrant into your LangChain applications, you can leverage its powerful vector similarity search engine to enhance the retrieval performance and accuracy.
