# AI Agent Memory Systems Engineering Training (Letta / MemGPT + Mem0 + Zep + Cognee + Graphiti + LangMem)

> Source: https://sukruyusufkaya.com/en/training/ai-agent-memory-sistemleri-muhendisligi-egitimi
> Updated: 2026-05-19T15:45:43.549Z
> Level: advanced
> Topics: ai agent memory, letta, memgpt, mem0, zep, graphiti, cognee, langmem, langgraph store, openai memory, claude memory, anthropic memory tool, episodic memory, semantic memory, procedural memory, temporal knowledge graph, memory consolidation, hybrid retrieval, locomo benchmark, KVKK-compliant agent memory
**TLDR:** A 3-day advanced Turkish agent-memory training that solves the persistent-memory needs of stateful AI agents end to end. Includes Letta (formerly MemGPT), Mem0, Zep + Graphiti temporal knowledge graph, Cognee GraphRAG + memory hybrid, LangChain LangMem + LangGraph Store, OpenAI Memory + Claude Projects + Anthropic Memory Tool API, episodic/semantic/procedural memory taxonomy, vector + graph hybrid retrieval, consolidation + forgetting lifecycle, LoCoMo + LongMemEval benchmarks, and KVKK-compliant deployment.

## Description

The AI Agent Memory Systems Engineering Training is a 3-day advanced program designed for AI Engineers, ML Engineers, Senior Backend Developers, and Agent Engineers who want to complete the transition from stateless LLM calls to the stateful agent paradigm.

## Learning Outcomes

- Skillfully manage the stateless LLM → stateful agent paradigm shift.
- Apply the cognitive memory taxonomy (episodic + semantic + procedural + working) to AI agent architecture.
- Make team-appropriate choices among Letta, Mem0, Zep, Cognee, LangMem.
- Decide between OpenAI / Claude / Gemini closed-source memory and self-hosted alternatives.
- Model episodic memory with a temporal knowledge graph (Graphiti + Neo4j).
- Optimize vector + graph + hybrid retrieval.
- Select the optimal embedding for Turkish (BGE-M3, multilingual-e5).
- Build memory consolidation + forgetting + KVKK-compliant delete pipelines.
- Measure memory-system quality with LoCoMo + LongMemEval + Turkish benchmarks.
- Monitor production agent memory with Langfuse + Phoenix.

<p>This training teaches, end to end and in Turkish, the agent-memory discipline that underpins the stateful AI agent paradigm: an agent that remembers its user, learns from conversation history, and sustains long-term context, going beyond classical stateless LLM API calls. The agent-memory ecosystem took shape in 2024-2026: Letta (formerly MemGPT, Berkeley 2023, virtual context pagination), Mem0 (YC S24, hybrid memory layer, 25K+ GitHub stars), Zep (Series A in 2024, temporal knowledge graph), Graphiti (Zep's open-source graph engine), Cognee (GraphRAG + memory hybrid), LangChain LangMem (native memory primitives), OpenAI Memory (ChatGPT cross-conversation), Claude Projects + Anthropic Memory Tool API (2025), and Google Gemini Memory (2025). In Turkey, training that covers this discipline end to end, across cognitive taxonomy, framework comparison, retrieval strategy, lifecycle management, and production evaluation, is virtually nonexistent; existing content either stays at the level of short single-tool tutorials or remains locked in academic papers. This program is designed to fill that gap as Turkey's most comprehensive production-grade agent-memory reference training.</p>

<p>The program's strategic backbone is the first module, which explains why teams move from the stateless LLM API to the stateful agent paradigm and maps the fast-growing 2024-2026 agent-memory ecosystem. The classical LLM API treats every call independently and must stay within the context-window limit (8K-1M tokens), whereas agent products (personal assistant, customer-support bot, sales CRM AI, personal tutor, personal trainer AI) need things that do not fit in a context window: persistent identity, long-term user knowledge, episodic recall, and multi-session continuity. The evidence-based answer to 'is just enlarging the context window enough?' is no: even a 1M-token context such as Gemini 2.5 Pro's does not remove the need for a memory layer once cost, latency, and needle-in-a-haystack accuracy are taken into account. The module compares the 2026 ecosystem (Letta, Mem0, Zep, Graphiti, Cognee, LangMem, OpenAI Memory, Claude Memory, Gemini Memory) and details a decision framework: single-user persona vs multi-user SaaS memory needs; self-hosted vs SaaS under KVKK, EU AI Act, and GDPR compliance; and the memory-cost trade-off (storage + retrieval + LLM-call overhead).</p>

<p>The second module adapts four memory types from cognitive science and neuroscience (building on Tulving's 1972 episodic/semantic distinction, extended with procedural and working memory) to the AI-agent plane. Episodic memory (time + event): 'what topic did the user ask about yesterday, what did I answer', mapped to a temporal database / event log (Zep, Graphiti). Semantic memory (conceptual knowledge): 'user is vegan, has a gluten allergy', mapped to a vector store + knowledge graph (Mem0, Cognee). Procedural memory (how-to): 'for this question type, first look at the user's data, then call an external API', modeled as workflow templates + tool-calling patterns. Working memory (short-term active context): managed via the in-context LLM prompt + a structured state object. The memory consolidation pipeline (hot working → warm episodic → cold semantic) runs as a periodic background worker with LLM summarization + fact extraction + deduplication. Forgetting lifecycle discipline (TTL, importance-based pruning, exponential decay) is covered in detail.</p>
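
The hot → warm → cold consolidation flow can be sketched roughly as below. This is a minimal sketch under stated assumptions, not any framework's implementation: `MemoryStore`, `extract_facts`, and `consolidate` are hypothetical names, and `extract_facts` replaces the LLM fact-extraction step with a trivial string rule so the example stays self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy three-tier store: hot working -> warm episodic -> cold semantic."""
    working: list[str] = field(default_factory=list)    # turns of the live session
    episodic: list[dict] = field(default_factory=list)  # time-stamped event log
    semantic: set[str] = field(default_factory=set)     # deduplicated facts


def extract_facts(turn: str) -> list[str]:
    """Hypothetical stand-in for LLM fact extraction.

    Assumption: turns starting with "I am " or "I have " state a durable fact.
    A real pipeline would call an LLM here instead.
    """
    text = turn.strip().lower().rstrip(".")
    for prefix, verb in (("i am ", "is"), ("i have ", "has")):
        if text.startswith(prefix):
            return [f"user {verb} {text[len(prefix):]}"]
    return []


def consolidate(store: MemoryStore, now: float) -> int:
    """Flush working memory into the episodic log and promote new facts.

    Returns the number of facts newly added to semantic memory.
    """
    new_facts = 0
    for turn in store.working:
        store.episodic.append({"at": now, "text": turn})
        for fact in extract_facts(turn):
            if fact not in store.semantic:  # deduplication step
                store.semantic.add(fact)
                new_facts += 1
    store.working.clear()
    return new_facts
```

In a production worker the same loop would typically run on a schedule, with summarization added before events are written to the episodic tier.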

<p>The third module covers in detail the virtual context-management framework born from Berkeley's late-2023 MemGPT paper and incorporated as Letta in 2024. MemGPT architecture: a three-tier hierarchical memory of main context (LLM prompt) + recall storage (recent conversation) + archival storage (persistent long-term memory); OS-style memory-paging logic; self-editing memory (the LLM updates its own memory via core_memory_append, core_memory_replace tools). Agent creation + memory configuration with the Python letta SDK; PostgreSQL backend + archival memory persistence; Letta Cloud vs self-hosted Docker deployment. Persistent agent persona templates + shared memory blocks; multi-agent setup: agent-to-agent message passing + shared archival memory; debugging with the Letta ADE (Agent Development Environment). The reference framework that all teams building stateful chatbots in Turkey should know.</p>
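
To make the paging and self-editing ideas concrete, here is a toy model in plain Python. It does not use the Letta SDK; `ToyAgentMemory` and the `persona` / `human` block labels are assumptions for illustration, and only the method names `core_memory_append` and `core_memory_replace` are taken from the text above.

```python
from collections import deque


class ToyAgentMemory:
    """Toy three-tier memory: main context, recall storage, archival storage."""

    def __init__(self, max_context_messages: int = 3):
        self.core = {"persona": "", "human": ""}  # always part of the prompt
        self.context = deque()                    # messages in the main context
        self.recall = []                          # paged-out conversation history
        self.archival = []                        # long-term notes
        self.max_context_messages = max_context_messages

    def core_memory_append(self, label: str, content: str) -> None:
        """Append a line to a core memory block (called by the LLM as a tool)."""
        self.core[label] = (self.core[label] + "\n" + content).strip()

    def core_memory_replace(self, label: str, old: str, new: str) -> None:
        """Replace text inside a core memory block; fail loudly if it is absent."""
        if old not in self.core[label]:
            raise ValueError(f"{old!r} not found in block {label!r}")
        self.core[label] = self.core[label].replace(old, new)

    def add_message(self, message: str) -> None:
        """Add a message; page the oldest ones out to recall storage when full."""
        self.context.append(message)
        while len(self.context) > self.max_context_messages:
            self.recall.append(self.context.popleft())

    def recall_search(self, query: str) -> list:
        """Case-insensitive substring search over paged-out messages."""
        return [m for m in self.recall if query.lower() in m.lower()]
```

The design point is that eviction never destroys information: a message leaves the prompt but stays searchable, which is the OS-style paging analogy.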

<p>The fourth module covers in detail Mem0, which came out of the YC S24 batch and reached 25K+ GitHub stars in 2024. Mem0's difference is a hybrid memory layer that unites vector store + knowledge graph + chat history behind a single, simple API (add() / search() / get_all() / update() / delete()), with a memory extraction pipeline (chat → LLM → fact → embedding) and conflict resolution for cases where a new fact contradicts an old one. Topics include multi-tenant memory isolation with user_id + agent_id + session_id; LLM-based memory extraction + fact deduplication; and Mem0 Platform (SaaS) vs self-hosted with a Qdrant / Pinecone backend. Integrations covered are LangChain, LangGraph, LlamaIndex, CrewAI, and AutoGen, along with Mem0 + OpenAI Assistants API + Claude Agent SDK production patterns. Thanks to the hybrid approach, semantic similarity + structured knowledge + conversation history converge in a single memory layer.</p>
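
The tenant-isolation and conflict-resolution behaviour described here can be illustrated with a small in-memory model. This is a sketch under assumptions, not the Mem0 API: `ToyMemoryLayer`, the `key` argument, and the `"ADD"` / `"UPDATE"` / `"NOOP"` return values are hypothetical, and a real layer would decide conflicts with an LLM and embeddings rather than exact key matching.

```python
class ToyMemoryLayer:
    """Toy per-user fact store with a simple overwrite-on-conflict policy."""

    def __init__(self):
        self._facts = {}  # (user_id, key) -> value

    def add(self, user_id: str, key: str, value: str) -> str:
        """Store a fact and report what happened to the existing memory."""
        slot = (user_id, key)
        if slot not in self._facts:
            self._facts[slot] = value
            return "ADD"
        if self._facts[slot] == value:
            return "NOOP"          # duplicate fact, nothing to do
        self._facts[slot] = value  # newer fact replaces the conflicting one
        return "UPDATE"

    def get_all(self, user_id: str) -> dict:
        """Return only the facts that belong to this user (tenant isolation)."""
        return {k: v for (uid, k), v in self._facts.items() if uid == user_id}

    def delete(self, user_id: str, key: str) -> bool:
        """Delete one fact; return True if something was removed."""
        return self._facts.pop((user_id, key), None) is not None
```

Scoping every read and write by `user_id` is what keeps one tenant's memories out of another tenant's retrieval results.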

<p>The fifth module provides a detailed analysis of Zep (Series A in 2024) and its open-source temporal graph engine Graphiti. Zep's difference is that it models episodic memory as a temporal knowledge graph: each fact's validity range (valid_from, valid_to), event ordering, contradictory-fact resolution, and point-in-time queries ('what the user said yesterday' vs 'what the user said today'). The bi-temporal model distinguishes valid time (when something was true in the real world) from transaction time (when it was written to the system). Graphiti combines Neo4j with an LLM for dynamic graph extraction (conversation → entity + relationship nodes), supports point-in-time retrieval with Neo4j Cypher queries, and offers hybrid search (semantic similarity + BM25 + graph traversal). The module also covers the Zep SDK, session management, and the user fact graph; Zep Cloud vs self-hosted Docker + Neo4j community edition; and hands-on customer-support, sales-CRM, and personalized-AI use cases.</p>
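
A bi-temporal point-in-time query can be sketched without a graph database. The code below is an illustrative assumption, not Zep's or Graphiti's data model: `Fact`, `TemporalFacts`, and `as_of` are hypothetical names, and instead of storing `valid_to` explicitly the sketch derives it implicitly, treating a fact as valid until a later fact for the same subject and predicate begins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: float   # valid time: when this became true in the real world
    recorded_at: float  # transaction time: when the system learned it


class TemporalFacts:
    """Append-only fact log that answers 'what was true at T, as known at K?'"""

    def __init__(self):
        self._facts = []

    def record(self, fact: Fact) -> None:
        self._facts.append(fact)

    def as_of(self, subject, predicate, valid_at, known_at=float("inf")):
        """Return the value valid at `valid_at`, using only facts recorded by `known_at`."""
        candidates = [
            f for f in self._facts
            if f.subject == subject
            and f.predicate == predicate
            and f.valid_from <= valid_at
            and f.recorded_at <= known_at
        ]
        if not candidates:
            return None
        # The most recent applicable fact supersedes older, contradicted ones.
        latest = max(candidates, key=lambda f: (f.valid_from, f.recorded_at))
        return latest.value
```

Because nothing is overwritten, the same store can answer both "where did the user live on day 6?" and "what did the system believe on day 6?", which are different questions whenever a fact is recorded late.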

<p>The sixth module covers in detail the architecture of Cognee (open-source, Apache-2.0 licensed), which combines GraphRAG and agent memory in a single pipeline. Cognee tasks: chunking → entity extraction → relationship inference → graph population → summarization; default ontology vs custom Pydantic ontology definition; choice of Neo4j / Memgraph / Kuzu graph backend; vector store backend (Qdrant / pgvector / Weaviate). Async pipeline orchestration + batch processing; Cognee + LangChain agent-memory integration; Cognee Cloud vs Docker self-hosted deployment. Ideal for teams who want to turn RAG into persistent agent memory, especially when enterprise documents and chat history need to be integrated into a single knowledge graph.</p>

<p>The seventh module covers in detail LangMem (memory-primitives library) launched by the LangChain team in 2024-2025, and the Store + Checkpoint memory layers inside LangGraph. LangMem primitives: semantic memory (fact extraction + storage + retrieval), episodic memory (event log + temporal queries), procedural memory (workflow templates + tool-selection patterns). LangGraph BaseStore interface + memory namespace organization; InMemoryStore (development) vs PostgresStore vs RedisStore (production) backends; Checkpoint (thread state) vs Store (cross-thread memory) distinction. With create_manage_memory_tool the agent manages its own memory; create_search_memory_tool semantic-search integration; background memory consolidation (hot → cold tier). The lowest-friction native memory approach for teams using LangChain.</p>
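
The Checkpoint vs Store distinction can be shown with two toy classes. These are not the LangGraph classes: `ToyCheckpointer` and `ToyStore` are hypothetical stand-ins written in plain Python, and the tuple-shaped namespace with prefix search is an assumption used here to illustrate namespace organization.

```python
class ToyCheckpointer:
    """Thread-scoped state: each conversation thread has its own snapshot."""

    def __init__(self):
        self._state = {}

    def save(self, thread_id: str, state: dict) -> None:
        self._state[thread_id] = dict(state)

    def load(self, thread_id: str) -> dict:
        return dict(self._state.get(thread_id, {}))


class ToyStore:
    """Cross-thread memory: items live under a namespace tuple plus a key."""

    def __init__(self):
        self._items = {}

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self._items[(tuple(namespace), key)] = value

    def get(self, namespace: tuple, key: str):
        return self._items.get((tuple(namespace), key))

    def search(self, namespace_prefix: tuple) -> list:
        """Return every value whose namespace starts with the given prefix."""
        prefix = tuple(namespace_prefix)
        return [v for (ns, _), v in self._items.items() if ns[:len(prefix)] == prefix]
```

A new thread starts with an empty checkpoint but can still read `("users", "u1", ...)` from the store, which is how memory outlives a single conversation.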

<p>The eighth module covers in detail closed-source providers' 2024-2026 memory products. OpenAI Memory: ChatGPT cross-conversation memory + Custom GPTs persistent context; Memory API endpoints + read / write / delete operations; Assistants API thread + run + message persistence. Claude Projects: file knowledge base + conversation history; Anthropic Memory Tool API (2025: file-system memory + CLAUDE.md persistent context); Claude Skills + persistent memory patterns. Google Gemini Memory (2025) overview. Closed-source vs self-hosted memory decision matrix; data residency (US vs EU region) + usage in Turkey under KVKK; cost comparison (provider memory pricing vs self-hosted cost). A KVKK + EU AI Act + GDPR-compliant closed-source memory usage guide is provided.</p>

<p>The ninth module addresses modern agent memory's core retrieval architecture. Limitations of pure vector search: semantic similarity ≠ relevance (the 'mentioning the same domain' problem), optimal embedding model selection for Turkish (BGE-M3, multilingual-e5-large-instruct, jina-v3, text-embedding-3-large, voyage-3); relevance optimization with reranking (Cohere Rerank 3.5, BGE-reranker, Voyage rerank-2.5). Limitations of pure graph traversal (scale + fuzzy matching) and the strength of the hybrid approach: multi-hop reasoning ('the country of X's boss'), graph traversal + vector search hybrid query planning, the Microsoft GraphRAG approach with community detection + summary. Hybrid scoring: time-decay (time-weighted scoring with exp(-λ · age)), importance-weighted (fact-importance scoring with an LLM judge), combining multi-strategy with Reciprocal Rank Fusion (RRF).</p>
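
The two scoring ideas named above, time decay with exp(-λ · age) and Reciprocal Rank Fusion, are small enough to write out. This is a minimal sketch: the function names, the default λ, and the choice of days as the age unit are assumptions, and k = 60 is simply a commonly used RRF constant rather than a value taken from the training.

```python
import math


def time_decay(age_days: float, lam: float = 0.1) -> float:
    """Time-weighted score exp(-lam * age); 1.0 for a brand-new memory."""
    return math.exp(-lam * age_days)


def reciprocal_rank_fusion(rankings, k: int = 60) -> list:
    """Fuse several ranked lists of memory ids into a single ranking.

    Each id receives 1 / (k + rank) from every list it appears in (rank starts
    at 1), and ids are returned by descending total score, ties broken by id.
    """
    scores = {}
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda m: (-scores[m], m))
```

For example, fusing a vector-search ranking `["m1", "m2", "m3"]` with a graph-traversal ranking `["m3", "m1", "m4"]` puts `m1` and `m3` ahead of the ids that only one retriever found. RRF works on ranks alone, so the two retrievers' raw scores never need to be calibrated against each other.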

<p>The tenth module addresses the lifecycle discipline that keeps production agent memory sustainable. Consolidation pipeline: a background worker performing fact extraction + deduplication from hot memory; contradiction resolution (does a new fact invalidate an old one?); compressing long conversations via summarization. Forgetting strategies: TTL (Time-To-Live) + LRU eviction; importance-based pruning (an LLM judge deciding whether each fact is 'forgettable'); exponential decay (time-based expiry with an exp(-λ · age) score). Memory versioning: snapshot + rollback (reverting wrong updates); a user-data delete pipeline compliant with the KVKK 'right to be forgotten'; audit log (who read/wrote which memory and when). Without this discipline, production memory bloats over time and cost + latency explode.</p>
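
The forgetting and deletion steps can be combined into one small pruning pass. The sketch below rests on assumptions: the memory record fields (`user_id`, `importance`, `created_day`, `ttl_days`), the threshold, and the function names are all hypothetical, and the importance value would in practice come from an LLM judge rather than being supplied by hand.

```python
import math


def retention_score(importance: float, age_days: float, lam: float = 0.05) -> float:
    """Importance weighted by exponential time decay."""
    return importance * math.exp(-lam * age_days)


def prune(memories: list, now_day: float, threshold: float = 0.2, lam: float = 0.05) -> list:
    """Keep memories that are within their TTL and still score above the threshold."""
    kept = []
    for m in memories:
        age = now_day - m["created_day"]
        ttl = m.get("ttl_days")
        expired = ttl is not None and age > ttl
        if not expired and retention_score(m["importance"], age, lam) >= threshold:
            kept.append(m)
    return kept


def erase_user(memories: list, user_id: str, audit_log: list, actor: str, now_day: float) -> list:
    """Remove every memory of one user and record the deletion in the audit log."""
    remaining = [m for m in memories if m["user_id"] != user_id]
    audit_log.append({
        "action": "erase_user",
        "user_id": user_id,
        "actor": actor,
        "day": now_day,
        "deleted": len(memories) - len(remaining),
    })
    return remaining
```

A real erasure pipeline would also have to reach derived data such as embeddings, graph edges, summaries, and backups, which this sketch leaves out.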

<p>The eleventh module addresses the evaluation discipline that systematically measures memory-system quality. Benchmarks: LoCoMo (a very long-conversation memory benchmark with dialogs of roughly 600 turns, published by Snap Research and academic collaborators and widely used in Mem0's evaluations), LongMemEval (a comprehensive long-term memory eval), MemoryBank; and building a Turkish NaturalConv-tr benchmark. Memory quality metrics: Recall@k (the share of truly relevant memories that appear in the top-k retrieved results), faithfulness (is the answer supported by the retrieved memory?), and evaluating memory-citation quality with LLM-as-judge. Production memory monitoring: memory-call latency overhead (typically 50-300 ms), cost (memory-retrieval tokens + LLM consolidation cost), and a user continuity score ('does the agent remember me' NPS), with Langfuse + Phoenix + Arize integration.</p>
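
Recall@k is often confused with precision@k, so a short side-by-side sketch may help. The function names and the idea of identifying memories by string ids are assumptions for illustration; the definitions themselves are the standard information-retrieval ones.

```python
def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Share of all relevant memories that appear in the top-k retrieved."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(set(relevant))


def precision_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Share of the top-k retrieved memories that are actually relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    relevant_set = set(relevant)
    return sum(1 for m in top_k if m in relevant_set) / len(top_k)
```

With `retrieved = ["m1", "m7", "m3", "m9"]` and `relevant = {"m1", "m3", "m5"}`, the top 2 contain one of three relevant memories (recall 1/3) while half of what was retrieved is relevant (precision 1/2).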

<p>In the capstone module, each participant designs an end-to-end agent-memory system tailored to their own scenario: scenario selection (personal AI assistant, enterprise customer-support bot, sales CRM AI, personal tutor agent, healthcare assistant, financial advisor), memory framework (Letta / Mem0 / Zep / Cognee / LangMem), backend (PostgreSQL + Qdrant + Neo4j), retrieval strategy (vector + graph + hybrid), consolidation + forgetting pipeline, eval framework, KVKK-compliant deployment, 90-day production roadmap. By the end of the training, participants reach a level of technical competence to skillfully manage the stateless LLM → stateful agent paradigm shift; apply the cognitive memory taxonomy to agent architecture; make team-appropriate choices among Letta / Mem0 / Zep / Cognee / LangMem; decide between OpenAI / Claude / Gemini closed-source memories and self-hosted alternatives; optimize vector + graph + hybrid retrieval; build consolidation + forgetting + KVKK-compliant delete pipelines; measure memory-system quality with LoCoMo + LongMemEval + Turkish benchmarks; and monitor production agent memory with Langfuse + Phoenix. The training consists of 3 days, 12 modules, and over 100 hands-on lessons.</p>