Build a Retrieval-Augmented Generation (RAG) pipeline with hybrid search (vector + keyword) and a reranking step for higher-precision answers.
## Task

Build a RAG pipeline with hybrid search and reranking for high-precision Q&A.

## Requirements

- Vector DB: pgvector, Pinecone, or Qdrant
- Embeddings: OpenAI text-embedding-3-small or Cohere embed-v3
- Reranker: Cohere rerank or a cross-encoder model
- Language: Python or TypeScript

## Pipeline

```
Query → [Hybrid Search] → [Rerank] → [LLM Generate]

1. Hybrid Search (parallel):
   a. Vector search: embed query → top 20 by cosine similarity
   b. Keyword search: BM25/FTS on the same corpus → top 20
   c. Merge results using Reciprocal Rank Fusion (RRF)

2. Rerank:
   - Take the merged top 30 results
   - Score (query, document) pairs with a cross-encoder
   - Keep the top 5

3. Generate:
   - Inject the top 5 chunks as context
   - System prompt: "Answer based only on the provided context"
   - Include source citations
```

## Implementation Notes

1. Chunk documents at 512 tokens with 50-token overlap
2. Store metadata: source URL, title, chunk index
3. Cache embeddings; don't re-embed on every query
4. Respond with "I don't have enough information" when the context is insufficient
5. Return a confidence score derived from the reranker scores
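The RRF merge in step 1c of the pipeline can be sketched in pure Python. The constant `k = 60` is the commonly used default from the RRF literature, not something the prompt specifies; the function names are illustrative:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs via Reciprocal Rank Fusion.

    Each document's fused score is sum(1 / (k + rank)) over every list
    in which it appears (rank is 1-based), so items ranked highly by
    both vector and keyword search rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort document IDs by fused score, highest first
    return sorted(scores, key=scores.get, reverse=True)

# Fuse the two top-20 lists, then keep the top 30 for reranking
# (hypothetical variable names for the two retrieval results):
# merged = rrf_merge([vector_top20, keyword_top20])[:30]
```

Because RRF only uses ranks, it needs no score normalization between the cosine-similarity and BM25 result lists, which is why it is a common choice for hybrid search.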
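The 512-token / 50-token-overlap chunking from implementation note 1 can be sketched as a sliding window over a token sequence. A real pipeline would tokenize with the embedding model's tokenizer (e.g. tiktoken for OpenAI embeddings); here the input is assumed to be an already-tokenized list:

```python
def chunk_tokens(tokens: list, size: int = 512, overlap: int = 50) -> list[list]:
    """Split a token sequence into windows of `size` tokens,
    each sharing `overlap` tokens with its predecessor."""
    step = size - overlap  # advance by 462 tokens per chunk at the defaults
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start : start + size])
        if start + size >= len(tokens):
            break  # last window already covers the tail; avoid a tiny remainder
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk; per note 2, each chunk would be stored alongside its source URL, title, and chunk index.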
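Implementation note 5 leaves the confidence computation open. One simple heuristic, assuming the reranker already emits relevance scores in [0, 1] (as Cohere rerank does), is the mean of the top-5 reranker scores; both the heuristic and the function name are illustrative choices, not part of the prompt:

```python
def confidence(rerank_scores: list[float], top_k: int = 5) -> float:
    """Heuristic answer confidence: mean of the top-k reranker
    relevance scores, clamped to [0, 1]. Returns 0.0 when no
    results survived reranking (pair with the "I don't have
    enough information" response)."""
    if not rerank_scores:
        return 0.0
    top = sorted(rerank_scores, reverse=True)[:top_k]
    return min(max(sum(top) / len(top), 0.0), 1.0)
```

A raw cross-encoder (e.g. an unbounded logit) would need a sigmoid or similar calibration first before this clamping makes sense.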