In our previous post, we explored reranking techniques to refine the results of an initial search. Today, we're taking a significant step forward by combining HNSW for efficient retrieval with reranking for precision. We're building a complete retrieval system, from data preparation to final result presentation.
The Big Picture: A Two-Stage Pipeline
Our system operates in two distinct stages:
- Retrieval (HNSW): This stage quickly identifies a set of candidate documents based on a query vector. Think of it as casting a wide net: we want to retrieve a manageable number of potentially relevant documents.
- Reranking: This stage re-evaluates the candidate documents from the retrieval stage, using a more sophisticated model to determine the most relevant results. Think of this as carefully examining the candidates from the net to select the best fit.
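Before wiring up real libraries, the two stages above can be sketched end-to-end in plain Python. The scoring functions here are toy stand-ins (word overlap is purely illustrative, not part of the real system); the point is only the shape of the pipeline: a cheap wide-net pass followed by a careful pass over the survivors.

```python
def retrieve(query, documents, k):
    """Stage 1: cheap score over all documents, keep the top-k candidates."""
    # Toy "fast" score: count of shared words (stands in for vector search).
    scored = [(len(set(query.split()) & set(doc.split())), i)
              for i, doc in enumerate(documents)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

def rerank(query, candidate_ids, documents):
    """Stage 2: a more careful score computed on just the candidates."""
    # Toy "slow" score: shared-word count normalized by document length.
    def score(i):
        words = documents[i].split()
        return len(set(query.split()) & set(words)) / (len(words) or 1)
    return sorted(candidate_ids, key=score, reverse=True)

docs = ["durable travel backpack", "red cotton shirt", "steel travel mug"]
candidates = retrieve("durable travel bag", docs, k=2)
final = rerank("durable travel bag", candidates, docs)
print(final)  # document ids, best match first
```

The rest of this post replaces these toys with a real vector index and real relevance models, but the control flow stays exactly this simple.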
1. Data Preparation
Let's assume we have a dataset of product descriptions. We need to convert these descriptions into numerical vectors. We're using Sentence Transformers for this purpose.
from sentence_transformers import SentenceTransformer
import numpy as np
# Load a pre-trained Sentence Transformer model
model = SentenceTransformer('all-mpnet-base-v2')
# Sample product descriptions
descriptions = [
    "High-quality leather wallet with multiple card slots.",
    "Stylish cotton t-shirt for everyday wear.",
    "Durable stainless steel water bottle with leak-proof lid.",
    "Comfortable running shoes with excellent cushioning.",
    "Elegant silk scarf with intricate floral pattern.",
    "Robust backpack with multiple compartments.",
]
# Convert descriptions to embeddings
embeddings = model.encode(descriptions)
print(f"Shape of embeddings: {embeddings.shape}") # Expected: (6, 768)
2. Building the HNSW Index
Now, we're using nmslib to build an HNSW index on top of the generated embeddings.
import nmslib
# Create an HNSW index using cosine distance
index = nmslib.init(method='hnsw', space='cosinesimil')
# Add the embeddings to the index (ids default to 0..n-1)
index.addDataPointBatch(embeddings)
# Build the index (tune M and efConstruction for build time vs. recall)
# M is the number of connections per node; efConstruction controls build-time effort.
index.createIndex({'M': 16, 'efConstruction': 200})
print(f"Index built with {len(descriptions)} documents.")
3. Querying the HNSW Index
Let's say we have a query: "Find a durable bag for travel." We need to convert this query into a vector.
query = "Find a durable bag for travel."
query_embedding = model.encode(query)
# Search the index (tune efSearch for search time vs. accuracy)
N = 3  # Retrieve top 3 candidates
results = index.knnQuery(query_embedding, k=N)  # returns (ids, distances)
print(f"Query results: {results}")
# Print the retrieved descriptions
for i in results[0]:
    print(f"Document {i}: {descriptions[i]}")
4. Implementing a Simple Reranking Model
For simplicity, we're using cosine similarity as our reranking score, comparing the query vector to the embeddings of the retrieved candidates.
def rerank(query_embedding, candidates, embeddings):
    """Reranks candidate document ids by cosine similarity to the query."""
    scores = []
    for i in candidates:
        # Cosine similarity: dot product normalized by vector magnitudes
        score = np.dot(query_embedding, embeddings[i]) / (
            np.linalg.norm(query_embedding) * np.linalg.norm(embeddings[i]))
        scores.append(score)
    # Sort candidate ids by score, highest first
    order = np.argsort(scores)[::-1]
    return [candidates[j] for j in order]

# Rerank the retrieved candidates
ranked_candidates = rerank(query_embedding, results[0], embeddings)
print("Reranked candidates:")
for i in ranked_candidates:
    print(f"Document {i}: {descriptions[i]}")
5. Combining HNSW and Reranking: The Complete Pipeline
Let's encapsulate the entire process into a function.
def retrieve_and_rerank(query, model, index, embeddings, n=5):
    """Retrieves documents using HNSW, then reranks them by cosine similarity."""
    query_embedding = model.encode(query)
    # HNSW retrieval: fetch the top-n candidate ids
    ids, _ = index.knnQuery(query_embedding, k=n)
    # Reranking
    return rerank(query_embedding, ids, embeddings)

# Example usage
ranked_candidates = retrieve_and_rerank(
    "Find a durable bag for travel.", model, index, embeddings)
print("Final ranked candidates:")
for i in ranked_candidates:
    print(f"Document {i}: {descriptions[i]}")
6. A More Sophisticated Reranking Model (Cross-Encoder)
While cosine similarity is simple, it doesn't capture the nuances of language. Let's use a cross-encoder model for a more accurate reranking. We're using sentence-transformers again, but this time with a cross-encoder.
from sentence_transformers import CrossEncoder

# Load a pre-trained cross-encoder model
cross_encoder = CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-2-v2')

def rerank_cross_encoder(query, candidates, descriptions, cross_encoder):
    """Reranks candidate document ids using a cross-encoder model."""
    # Score each (query, document) pair jointly
    pairs = [(query, descriptions[i]) for i in candidates]
    scores = cross_encoder.predict(pairs)
    order = np.argsort(scores)[::-1]
    return [candidates[j] for j in order]

# Example usage
N = 5
ids, _ = index.knnQuery(query_embedding, k=N)
ranked_candidates = rerank_cross_encoder(query, ids, descriptions, cross_encoder)
print("Final ranked candidates (cross-encoder):")
for i in ranked_candidates:
    print(f"Document {i}: {descriptions[i]}")
7. Performance Considerations and Tuning
- HNSW Parameters (M, efConstruction, efSearch): Experiment with different values to balance index build time, search speed, and accuracy. In nmslib, the search-time parameter can be set with index.setQueryTimeParams({'efSearch': ...}).
- Reranking Model: Choose a model that aligns with your specific use case. Consider factors like model size, accuracy, and inference time.
- Hybrid Approach: Combine HNSW with other retrieval methods for improved performance.
- Caching: Cache frequently used embeddings and search results to reduce latency.
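As a sketch of the caching point above, the standard-library functools.lru_cache can memoize query embeddings so repeated queries skip the model entirely. Here, embed() is a hypothetical stand-in for an expensive model.encode() call, and the counter exists only to make the cache's effect visible.

```python
from functools import lru_cache

CALLS = {"n": 0}

def embed(text):
    """Hypothetical stand-in for model.encode(text)."""
    CALLS["n"] += 1  # count how often the "model" actually runs
    # Toy deterministic vector derived from the text bytes
    return tuple(float(b) for b in text.encode("utf-8")[:4])

@lru_cache(maxsize=1024)
def embed_cached(text):
    # lru_cache keys on the query string; returning an immutable tuple
    # keeps cached values safe from mutation by callers.
    return embed(text)

v1 = embed_cached("Find a durable bag for travel.")
v2 = embed_cached("Find a durable bag for travel.")  # served from the cache
print(CALLS["n"])  # embed() ran only once despite two lookups
```

The same idea extends to caching (query, ranked-ids) search results keyed on the query string, with the usual caveat that the cache must be invalidated when the index contents change.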
Key Takeaways
- Combining HNSW for efficient retrieval with a sophisticated reranking model significantly improves the quality of search results.
- Careful tuning of HNSW parameters and selection of an appropriate reranking model are crucial for optimal performance.
- This two-stage pipeline offers a flexible and scalable solution for a wide range of retrieval tasks.
- Choosing the right model and understanding its trade-offs is essential for achieving the best results.
This complete retrieval system demonstrates how to leverage the power of HNSW and reranking to build a high-performance search application. By combining efficient retrieval with accurate relevance scoring, you can deliver a superior user experience and unlock the full potential of your data.