# Long-Context Dataset Engineering: NIAH, RULER, and Data for 128K Context FT

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-long-context-niah-ruler-128k
> Updated: 2026-05-14T14:42:51.066Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part II — Tokenizer & Data Engineering

**TLDR:** To actually use Llama 3.1's 128K context, you need long-context SFT data. This guide covers how to produce it: synthetic NIAH (needle-in-a-haystack) examples, RULER benchmark recipes, long-form QA datasets, code-repo concatenation, and repository-level context. It ends with a long-context QLoRA run (128K sequence length) on an RTX 4090, peaking at 22 GB with packing.


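As a concrete starting point, here is a minimal sketch of how a NIAH-style synthetic SFT example can be assembled: a short "needle" fact is inserted at a controlled depth inside filler "haystack" text, and the matching question/answer pair becomes the training target. The function names, needle text, and record schema below are illustrative assumptions, not code from this cookbook.

```python
import random

# Minimal NIAH-style synthetic data sketch (helper names and the example
# needle/question are illustrative, not taken from the article).

NEEDLE = "The secret access code for the archive is 7413-alpha."
QUESTION = "What is the secret access code for the archive?"
ANSWER = "7413-alpha"

def build_niah_example(haystack_paragraphs, depth=0.5):
    """Insert the needle at roughly `depth` (0.0 = start, 1.0 = end) of the haystack."""
    paragraphs = list(haystack_paragraphs)
    insert_at = int(depth * len(paragraphs))
    paragraphs.insert(insert_at, NEEDLE)
    context = "\n\n".join(paragraphs)
    # Chat-style SFT record; adapt to whatever schema your trainer expects.
    return {
        "messages": [
            {"role": "user", "content": f"{context}\n\n{QUESTION}"},
            {"role": "assistant", "content": ANSWER},
        ]
    }

# Sweep needle depth so the model practices retrieval at many positions,
# and reshuffle the haystack per example to avoid memorizing layout.
rng = random.Random(0)
filler = [f"Filler paragraph {i} about unrelated topics." for i in range(2000)]
dataset = []
for d in range(11):
    rng.shuffle(filler)
    dataset.append(build_niah_example(filler, depth=d / 10))
```

In practice the haystack would be real documents long enough to fill most of the 128K window, and needle depth is typically swept across positions so retrieval does not overfit to a single location.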