Cost Modeling: Turning Cache Hits and Misses into Dollars

Author: Şükrü Yusuf KAYA

The mathematical relationship between cache hit rate and cost. A spreadsheet template plus a Python calculator, so you can hand your manager a report on how much caching saves.


5/14/2026

Intermediate

Cost Modeling: Selling the Cache to Your Manager

If you have implemented caching but cannot give a clear answer to "how much does it save?", getting the investment approved is hard. In this lesson we build the mathematical model.

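The Basic Equation

\[
C_{\text{total}} = C_{\text{static}}(h) + \frac{N \cdot P_{\text{dynamic}} \cdot c_{\text{input}}}{10^{6}} + \frac{N \cdot P_{\text{output}} \cdot c_{\text{output}}}{10^{6}}
\]

Where:

\(N\) = number of queries per month
\(P_{\text{static}}\) = cacheable tokens per query
\(P_{\text{dynamic}}\) = dynamic tokens per query (different for every query)
\(P_{\text{output}}\) = output tokens per query
\(h\) = cache hit rate (0 to 1)
\(c_{\text{input}}, c_{\text{output}}, c_{\text{write}}, c_{\text{read}}\) = prices in USD per million tokens, corresponding to the input, output, cache_write, and cache_read keys in the calculator below

Static cost as a function of the cache hit rate:

\[
C_{\text{static}}(h) = \frac{N \cdot P_{\text{static}}}{10^{6}} \left[ (1 - h)\, c_{\text{write}} + h\, c_{\text{read}} \right]
\]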
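
One quantity that falls out of the static-cost relation is the hit rate at which caching starts to pay off. The sketch below assumes, as the calculator in this lesson does, that every miss is billed at the cache-write price and every hit at the cache-read price. The function names are illustrative and not part of the lesson; the prices are the Sonnet figures from the lesson's price table.

```python
def effective_static_price(hit_rate: float, cache_write: float, cache_read: float) -> float:
    """Blended price (USD per million static tokens) at a given hit rate."""
    return (1 - hit_rate) * cache_write + hit_rate * cache_read


def break_even_hit_rate(input_price: float, cache_write: float, cache_read: float) -> float:
    """Hit rate at which the blended static price equals the uncached input price.

    Solves (1 - h) * cache_write + h * cache_read = input_price for h.
    """
    return (cache_write - input_price) / (cache_write - cache_read)


# Sonnet prices from the lesson's price table (USD per million tokens)
blended = effective_static_price(0.92, cache_write=3.75, cache_read=0.30)
threshold = break_even_hit_rate(input_price=3.0, cache_write=3.75, cache_read=0.30)

print(f"Blended static price at 92% hit rate: ${blended:.3f} per million tokens")
print(f"Caching pays off above a hit rate of {threshold:.1%}")
```

With these prices the break-even point is 0.75 / 3.45, or roughly a 22% hit rate. When the cache-write price equals the input price, the threshold drops to zero, so any hit at all is a net saving under this model.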

Python Cost Calculator

python

def estimate_monthly_cost(
    queries_per_month: int,
    static_tokens_per_query: int,  # cacheable part
    dynamic_tokens_per_query: int,  # differs on every query
    output_tokens_per_query: int,
    cache_hit_rate: float,
    provider: str = "anthropic_sonnet_4_6",
) -> dict:
    """Return a detailed breakdown of the monthly cost."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
 
    # Prices in USD per million tokens
    PRICING = {
        "anthropic_sonnet_4_6": {
            "input": 3.0, "output": 15.0,
            "cache_write": 3.75, "cache_read": 0.30,
        },
        "anthropic_opus_4_7": {
            "input": 15.0, "output": 75.0,
            "cache_write": 18.75, "cache_read": 1.50,
        },
        "openai_gpt4o": {
            # OpenAI has no cache-write surcharge, so cache_write equals input
            "input": 2.5, "output": 10.0,
            "cache_write": 2.5, "cache_read": 1.25,
        },
        "gemini_pro": {
            "input": 1.25, "output": 10.0,
            "cache_write": 1.25, "cache_read": 0.31,
        },
    }
 
    p = PRICING[provider]
    misses = queries_per_month * (1 - cache_hit_rate)
    hits = queries_per_month * cache_hit_rate
 
    # Static cost
    static_cost = (
        misses * static_tokens_per_query / 1e6 * p["cache_write"]
        + hits * static_tokens_per_query / 1e6 * p["cache_read"]
    )
 
    # Dynamic cost (paid on every query)
    dynamic_cost = queries_per_month * dynamic_tokens_per_query / 1e6 * p["input"]
 
    # Output cost
    output_cost = queries_per_month * output_tokens_per_query / 1e6 * p["output"]
 
    total = static_cost + dynamic_cost + output_cost
 
    # No-cache baseline: every input token is billed at the regular input price
    no_cache_total = (
        queries_per_month * (static_tokens_per_query + dynamic_tokens_per_query) / 1e6 * p["input"]
        + queries_per_month * output_tokens_per_query / 1e6 * p["output"]
    )
    savings = no_cache_total - total
    usd_to_try = 33.5  # USD to TRY rate hard-coded in this lesson; update before reporting
 
    return {
        "static_cost_usd": static_cost,
        "dynamic_cost_usd": dynamic_cost,
        "output_cost_usd": output_cost,
        "total_with_cache_usd": total,
        "total_no_cache_usd": no_cache_total,
        "savings_usd": savings,
        # Guard against division by zero when there is no traffic
        "savings_pct": savings / no_cache_total * 100 if no_cache_total else 0.0,
        "total_with_cache_try": total * usd_to_try,
        "savings_try_monthly": savings * usd_to_try,
    }
 
# Scenario: e-commerce assistant
result = estimate_monthly_cost(
    queries_per_month=100_000,
    static_tokens_per_query=50_000,    # KB + system + tools
    dynamic_tokens_per_query=200,      # user query
    output_tokens_per_query=500,
    cache_hit_rate=0.92,
)
 
print(f"Static cost:      ${result['static_cost_usd']:>10,.2f}")
print(f"Dynamic cost:     ${result['dynamic_cost_usd']:>10,.2f}")
print(f"Output cost:      ${result['output_cost_usd']:>10,.2f}")
print("───────────────────────────")
print(f"With cache:       ${result['total_with_cache_usd']:>10,.2f}  |  {result['total_with_cache_try']:>12,.2f} TL")
print(f"Without cache:    ${result['total_no_cache_usd']:>10,.2f}")
print(f"Savings:          ${result['savings_usd']:>10,.2f}  |  {result['savings_try_monthly']:>12,.2f} TL  ({result['savings_pct']:.1f}%)")
print(f"Annual savings:                           {result['savings_try_monthly'] * 12:>12,.2f} TL")


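ROI Report

Present this report format to your manager. For the e-commerce scenario above, the calculator gives about 4.87 million TL in annual savings at 33.5 TL per USD, which is enough to justify the investment.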
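
The report layout can also be generated in code. The sketch below is a hypothetical plain-text format, not the author's template: the function name `build_roi_report`, the field labels, and the `implementation_cost_usd` figure are all assumptions made for illustration. The two monthly totals are what the calculator's formula yields for the e-commerce scenario.

```python
def build_roi_report(
    no_cache_usd: float,
    with_cache_usd: float,
    implementation_cost_usd: float,
) -> str:
    """Render a plain-text ROI summary from monthly cost figures."""
    monthly_savings = no_cache_usd - with_cache_usd
    annual_savings = monthly_savings * 12
    # Months until cumulative savings cover the one-time cost
    payback_months = (
        implementation_cost_usd / monthly_savings if monthly_savings > 0 else float("inf")
    )
    # First-year return relative to the one-time cost
    roi_pct = (
        (annual_savings - implementation_cost_usd) / implementation_cost_usd * 100
        if implementation_cost_usd > 0
        else float("inf")
    )
    lines = [
        "Prompt caching ROI summary",
        f"Monthly cost without cache: ${no_cache_usd:,.2f}",
        f"Monthly cost with cache:    ${with_cache_usd:,.2f}",
        f"Monthly savings:            ${monthly_savings:,.2f}",
        f"Annual savings:             ${annual_savings:,.2f}",
        f"One-time implementation:    ${implementation_cost_usd:,.2f}",
        f"Payback period:             {payback_months:.1f} months",
        f"First-year ROI:             {roi_pct:,.0f}%",
    ]
    return "\n".join(lines)


report = build_roi_report(
    no_cache_usd=15_810.00,            # e-commerce scenario without caching
    with_cache_usd=3_690.00,           # e-commerce scenario at a 92% hit rate
    implementation_cost_usd=5_000.00,  # hypothetical figure, not from the lesson
)
print(report)
```

Putting the one-time cost next to the savings lets the reader see the payback period directly, which is usually the number an approval decision turns on.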

Sensitivity Analysis: The Effect of Hit Rate

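How much do the savings change when the cache hit rate moves by 5 points? For the e-commerce scenario above (Sonnet pricing, no-cache baseline of $15,810 per month), the calculator gives:

Hit Rate	Monthly Cost (USD)	Savings vs. No Cache
50%	$10,935.00	30.8%
70%	$7,485.00	52.7%
85%	$4,897.50	69.0%
90%	$4,035.00	74.5%
95%	$3,172.50	79.9%
98%	$2,655.00	83.2%

Going from 85% to 95% adds about 11 percentage points of savings ($1,725 per month in this scenario). That is why hit rate optimization (Module 4) is valuable.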
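
A sweep like the table above can be regenerated for any scenario. The sketch below re-implements the calculator's cost model in compact form so that it runs on its own; the constant and function names are illustrative, and the output depends entirely on the prices and token counts plugged in (here the Sonnet prices and the e-commerce scenario from this lesson).

```python
# Sonnet prices from the lesson's price table, USD per million tokens
PRICE = {"input": 3.0, "output": 15.0, "cache_write": 3.75, "cache_read": 0.30}

# E-commerce scenario from the lesson
QUERIES = 100_000
STATIC_TOKENS = 50_000
DYNAMIC_TOKENS = 200
OUTPUT_TOKENS = 500


def monthly_cost(hit_rate: float) -> float:
    """Monthly USD cost with caching: misses pay cache_write, hits pay cache_read."""
    static_price = (1 - hit_rate) * PRICE["cache_write"] + hit_rate * PRICE["cache_read"]
    per_query_usd = (
        STATIC_TOKENS * static_price
        + DYNAMIC_TOKENS * PRICE["input"]
        + OUTPUT_TOKENS * PRICE["output"]
    ) / 1e6
    return QUERIES * per_query_usd


def no_cache_cost() -> float:
    """Monthly USD cost when every input token is billed at the regular input price."""
    per_query_usd = (
        (STATIC_TOKENS + DYNAMIC_TOKENS) * PRICE["input"]
        + OUTPUT_TOKENS * PRICE["output"]
    ) / 1e6
    return QUERIES * per_query_usd


def sweep(hit_rates):
    """Return (hit_rate, monthly_cost, savings_pct) rows for each hit rate."""
    baseline = no_cache_cost()
    return [
        (h, monthly_cost(h), (baseline - monthly_cost(h)) / baseline * 100)
        for h in hit_rates
    ]


rows = sweep([0.50, 0.70, 0.85, 0.90, 0.95, 0.98])
print("Hit Rate\tMonthly Cost\tSavings")
for h, cost, pct in rows:
    print(f"{h:.0%}\t${cost:,.2f}\t{pct:.1f}%")
```

Because the model is linear in the hit rate, every additional point of hit rate is worth the same dollar amount: the number of static tokens per month (in millions) times the gap between the cache-write and cache-read prices, divided by 100.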

In the Next Lesson

TTL strategy: 5m vs. 1h, and where the break-even point lies.
