
Hands-on: Mini RAG - a Company Handbook Q&A Bot

A working mini RAG built from scratch: chunk a Markdown document → embed → Postgres pgvector → retrieval → Claude answer.

Şükrü Yusuf KAYA
18 min read
Intermediate

Mini RAG Lab: Company Handbook Q&A

Step 1: Setup

```bash
# Postgres + pgvector
docker run -d --name pg-rag \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  pgvector/pgvector:pg16

pip install psycopg2-binary openai anthropic langchain
```
Setup

Step 2: Schema

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE handbook_chunks (
    id SERIAL PRIMARY KEY,
    source TEXT NOT NULL,
    section TEXT,
    content TEXT NOT NULL,
    embedding vector(1536) NOT NULL  -- 1536 dims for OpenAI text-embedding-3-small
);

CREATE INDEX ON handbook_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```
Postgres + pgvector schema
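The `vector_cosine_ops` operator class means nearest-neighbor queries will rank by cosine distance, which pgvector exposes as the `<=>` operator. As a plain-Python illustration of what that operator computes (not pgvector's actual implementation, which runs in C inside Postgres):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector's <=> operator defines it: 1 - cos(a, b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```

A distance of 0 means identical direction and 2 means opposite, which is why the retrieval query later converts it to a similarity score with `1 - (embedding <=> ...)`.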

Step 3: Indexing Pipeline

```python
import os, psycopg2
from openai import OpenAI
from langchain.text_splitter import MarkdownHeaderTextSplitter

oai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
conn = psycopg2.connect("postgresql://postgres:secret@localhost:5432/postgres")

def embed(texts: list[str]) -> list[list[float]]:
    r = oai.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in r.data]

def index_handbook(md_path: str):
    with open(md_path) as f:
        md = f.read()

    # Split on Markdown headers so each chunk stays within one section
    splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=[("#", "h1"), ("##", "h2"), ("###", "h3")]
    )
    chunks = splitter.split_text(md)

    texts = [c.page_content for c in chunks]
    embs = embed(texts)

    cur = conn.cursor()
    for chunk, emb in zip(chunks, embs):
        cur.execute(
            "INSERT INTO handbook_chunks (source, section, content, embedding) VALUES (%s, %s, %s, %s::vector)",
            # psycopg2 doesn't know the vector type, so pass the embedding
            # as its text literal '[0.1, 0.2, ...]' and cast it in SQL
            (md_path, chunk.metadata.get("h2", ""), chunk.page_content, str(emb)),
        )
    conn.commit()

index_handbook("handbook.md")
```
Indexing pipeline
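The pipeline above embeds every chunk in a single API call, which works for a small handbook but will hit request-size limits on a large corpus. A minimal batching sketch (the `batched` helper and the batch size of 100 are illustrative choices, not part of the original pipeline):

```python
def batched(items: list[str], batch_size: int) -> list[list[str]]:
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# Inside index_handbook, instead of one big embed(texts) call:
#   embs = []
#   for batch in batched(texts, 100):
#       embs.extend(embed(batch))

print(batched(["a", "b", "c", "d", "e"], 2))  # [['a', 'b'], ['c', 'd'], ['e']]
```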

Step 4: Retrieval + Generation

```python
from anthropic import Anthropic

ant = Anthropic()

def retrieve(query: str, k: int = 5):
    [emb] = embed([query])
    cur = conn.cursor()
    # Pass the embedding as its text literal and cast; rank by cosine distance
    cur.execute("""
        SELECT id, source, section, content,
               1 - (embedding <=> %s::vector) AS score
        FROM handbook_chunks
        ORDER BY embedding <=> %s::vector
        LIMIT %s
    """, (str(emb), str(emb), k))
    return cur.fetchall()

def answer(question: str) -> str:
    docs = retrieve(question, k=5)
    docs_xml = "\n".join(
        f'<document index="{i+1}">\n<source>{d[1]} - {d[2]}</source>\n<content>{d[3]}</content>\n</document>'
        for i, d in enumerate(docs)
    )

    system = """Answer the question using the documents below.

RULES:
- Use only the information inside <documents>
- End every sentence with a [doc-X] citation
- If the answer is not in the documents, say "This information is not in the handbook"
"""

    user = f"<documents>\n{docs_xml}\n</documents>\n\n<question>\n{question}\n</question>"

    r = ant.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        system=system,
        messages=[{"role": "user", "content": user}],
    )
    return r.content[0].text

print(answer("What is the vacation day policy?"))
```
Retrieval + generation
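One gap worth noting: `retrieve` always returns `k` rows, even when nothing in the handbook is actually relevant, so the model gets fed noise. A simple mitigation is to drop low-scoring rows before building the prompt. A sketch under the row shape returned above, `(id, source, section, content, score)`; the 0.25 threshold is an illustrative value you should tune on your own data:

```python
def filter_by_score(rows: list[tuple], min_score: float = 0.25) -> list[tuple]:
    """Keep only retrieved rows whose cosine similarity (last column) clears the threshold."""
    return [row for row in rows if row[-1] >= min_score]

# Example with fake retrieval rows: (id, source, section, content, score)
rows = [
    (1, "handbook.md", "Leave", "Employees get 20 vacation days.", 0.61),
    (2, "handbook.md", "IT", "Reset your password via the portal.", 0.12),
]
print(filter_by_score(rows))  # keeps only the score-0.61 row
```

In `answer`, you would call `docs = filter_by_score(retrieve(question, k=5))`; when the filtered list is empty, you can short-circuit with the "not in the handbook" reply without spending a model call.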
Congratulations! In roughly 50 lines of code you have built the core of a RAG pipeline. Next steps on the road to production: hybrid retrieval, a reranker, and monitoring.

