# Streaming & Sharded Datasets: Training on 500GB+ Without Disk

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-streaming-sharded-datasets-large-scale
> Updated: 2026-05-14T14:42:50.980Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
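> Module: Part II — Tokenizer & Data Engineering

**TLDR:** A 1TB dataset fits on the 2TB NVMe of an RTX 4090 workstation, but tokenizing and caching it needs 5TB. The solution is streaming: read samples shard by shard instead of materializing the whole processed dataset on disk. The toolkit: Hugging Face `datasets.IterableDataset`, WebDataset `.tar` shards, MosaicML Streaming (MDS), S3 streaming, resumable streaming, and the multi-worker collator pattern.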
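To make the resumable, multi-worker idea concrete, here is a minimal pure-Python sketch. It is not the API of Hugging Face `datasets`, WebDataset, or MosaicML Streaming. The class name `ResumableShardStream` and its methods are hypothetical, and in-memory lists stand in for real `.tar` or MDS shards. It only illustrates two ideas: each worker reads a disjoint slice of the shard list, and the read position is a small state that can be checkpointed and restored.

```python
from typing import Dict, Iterator, Sequence


class ResumableShardStream:
    """Hypothetical sketch: stream samples shard by shard and track the position."""

    def __init__(self, shards: Sequence[Sequence[str]],
                 worker_id: int = 0, num_workers: int = 1) -> None:
        # Each worker takes every num_workers-th shard, so no shard is read twice.
        self.shards = list(shards[worker_id::num_workers])
        self.shard_idx = 0  # index into this worker's shard list
        self.offset = 0     # samples already yielded from the current shard

    def state_dict(self) -> Dict[str, int]:
        # Small enough to store next to a model checkpoint.
        return {"shard_idx": self.shard_idx, "offset": self.offset}

    def load_state_dict(self, state: Dict[str, int]) -> None:
        self.shard_idx = state["shard_idx"]
        self.offset = state["offset"]

    def __iter__(self) -> Iterator[str]:
        while self.shard_idx < len(self.shards):
            shard = self.shards[self.shard_idx]
            while self.offset < len(shard):
                sample = shard[self.offset]
                # Advance before yielding so a saved state points at the next sample.
                self.offset += 1
                yield sample
            self.shard_idx += 1
            self.offset = 0
```

Only one shard needs to be resident at a time, which is what keeps disk and memory use flat as the dataset grows. In a real pipeline the shard list would hold paths or URLs, and the inner loop would decode samples from the opened shard.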

