# Modern LLM Embedding Layer + Embedding Tying: Input/Output Sharing and Scaling

> Source: https://sukruyusufkaya.com/en/learn/llm-muhendisligi/modern-llm-embedding-tying-input-output-paylasim
> Updated: 2026-05-13T13:00:27.195Z
> Category: LLM Engineering
> Module: Module 7: Embedding Layer — The Vector Space of Meaning
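
**TLDR:** The embedding layer in modern transformer architectures: `nn.Embedding` initialization (Llama-3 style); embedding tying (sharing one weight matrix between the input embedding and the output projection), with its mathematical justification and memory savings; whether to scale embeddings by `sqrt(d_model)` ahead of the first pre-layernorm block; why no positional embedding is added to the token embeddings when RoPE is used (position enters through rotations of the queries and keys inside attention); and multimodal embeddings (vision and audio tokens). Also covers architectural differences between Llama-3, GPT-4o, and Claude-3.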
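To make the tying idea concrete, here is a minimal, dependency-free sketch rather than a real model implementation. It assumes a single weight matrix of shape `(vocab_size, d_model)` that serves both as the input lookup table and as the output projection. The names `TiedEmbedding`, `init_embedding`, `tying_savings`, the `std=0.02` initialization, and the `scale_by_sqrt_d` flag are all hypothetical choices made for illustration; in PyTorch the same idea would use `nn.Embedding` and a shared weight with the output `nn.Linear`.

```python
import math
import random


def init_embedding(vocab_size, d_model, std=0.02, seed=0):
    """Draw a (vocab_size, d_model) table from N(0, std^2).

    std=0.02 is an assumed, commonly used value, not a confirmed setting
    of any particular model.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(d_model)]
            for _ in range(vocab_size)]


class TiedEmbedding:
    """One weight matrix used for both token lookup and output logits."""

    def __init__(self, vocab_size, d_model, scale_by_sqrt_d=False):
        self.weight = init_embedding(vocab_size, d_model)
        # Optional sqrt(d_model) scaling of the input embeddings only.
        self.scale = math.sqrt(d_model) if scale_by_sqrt_d else 1.0

    def embed(self, token_ids):
        # Input side: row lookup. No positional vector is added here,
        # which matches the RoPE setup where position is handled in attention.
        return [[self.scale * w for w in self.weight[t]] for t in token_ids]

    def logits(self, hidden):
        # Output side: dot product of the hidden state with every row of
        # the same matrix, i.e. hidden @ weight.T.
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.weight]


def tying_savings(vocab_size, d_model, bytes_per_param=2):
    """Parameters and bytes saved by not storing a separate output matrix."""
    params = vocab_size * d_model
    return params, params * bytes_per_param
```

As a usage example, `tying_savings(128256, 4096)` gives 525,336,576 parameters, or about 1.05 GB at 2 bytes per parameter. The vocabulary size and width are illustrative inputs on the scale of an 8B-parameter model, and the call says nothing about whether any specific released model actually ties its embeddings.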

