Vector Memory (Semantic Retrieval)

Embed past interactions, retrieve by semantic similarity to the query.

3 hours · 1 prerequisite

Instead of stuffing the entire conversation into context, embed each message (or N-message chunk) and store it in a vector DB. For each new query, retrieve the top-K most similar past interactions and inject them into the prompt.
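The embed, store, retrieve, inject loop can be sketched in a few lines. This is a minimal illustration under loud assumptions: `embed` is a toy hashed bag-of-words function standing in for a real embedding model, `VectorMemory` is an in-memory list standing in for a vector DB, and `build_prompt` is a hypothetical helper. None of these names come from a particular library.

```python
import hashlib
import math
import re

DIM = 256  # toy embedding size, chosen only for this sketch


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class VectorMemory:
    """In-memory stand-in for a vector DB: stores (text, vector) pairs."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        # Embed once at write time so queries only embed the query itself.
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, v)), t) for t, v in self.items]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [t for _, t in scored[:k]]


def build_prompt(memory: VectorMemory, query: str, k: int = 2) -> str:
    """Inject the top-k recalled messages ahead of the new query."""
    recalled = memory.search(query, k)
    context = "\n".join(f"- {m}" for m in recalled)
    return f"Relevant past interactions:\n{context}\n\nUser: {query}"


memory = VectorMemory()
memory.add("My dog is named Rex")
memory.add("I prefer dark mode in my editor")
memory.add("The deployment failed on Tuesday")
print(build_prompt(memory, "what is my dog named", k=1))
```

With a real embedding model the matching would be semantic rather than token-based, but the control flow stays the same: only the top-K retrieved messages reach the prompt, not the whole history.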

Pro: effectively unbounded "memory", because only the relevant pieces enter context.

Con: semantic similarity ≠ relevance, so the wrong chunk sometimes comes back. Hybrid retrieval (BM25 + dense) plus reranking is effectively mandatory.
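One way to combine a BM25 ranking with a dense ranking is reciprocal rank fusion (RRF). The text above does not name a fusion method, so RRF is an assumption here, as are the message IDs and the constant `k = 60` (a commonly used default). The sketch assumes each retriever has already returned its own ranked list of IDs, and it leaves out the reranking step, which would typically rescore the fused candidates with a stronger model.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists: each list adds 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical outputs of a keyword (BM25) retriever and a dense retriever.
bm25_ranking = ["m7", "m2", "m9"]
dense_ranking = ["m2", "m4", "m7"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)  # ['m2', 'm7', 'm4', 'm9']
```

Items that both retrievers rank highly (`m2`, `m7`) rise to the top, while an item found by only one retriever still survives as a candidate for the reranker.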

Stack: pgvector (Postgres), Pinecone, Qdrant, Weaviate, Chroma.

Prerequisites