
Inference

The stage in which a trained model performs prediction or generation on new data in real-world use.

Inference is the stage where a model leaves the laboratory and begins operating in the real world. It applies what was learned during training to new data in order to predict, classify, rank, or generate outputs. In production, concerns such as latency, cost, scalability, and reliability surface most visibly at inference time. A good model is therefore not just one that performs well during training, but one that sustains strong performance during inference; in productized AI systems, that distinction is critical.
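To make the training/inference split concrete, here is a minimal, hypothetical sketch (not from any particular library): a one-parameter model is fit with gradient descent (training), then the frozen parameter is applied to unseen input while measuring latency (inference). The model, data, and function names are illustrative assumptions.

```python
import time

def train(samples, lr=0.01, epochs=200):
    """Training stage: fit weight w on known (x, y) pairs via gradient descent."""
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            w -= lr * 2 * (w * x - y) * x  # gradient of squared error (w*x - y)^2
    return w

def infer(w, x):
    """Inference stage: the trained weight is applied to new data, not updated."""
    return w * x

# Training on data whose true relation is y = 3x (toy example)
w = train([(1, 3), (2, 6), (3, 9)])

# Inference on a new, unseen input; latency is what production systems monitor
start = time.perf_counter()
prediction = infer(w, 10)
latency_ms = (time.perf_counter() - start) * 1000
print(f"prediction={prediction:.2f}, latency={latency_ms:.4f} ms")
```

Note the asymmetry the article describes: `train` is expensive and runs once offline, while `infer` must be cheap and fast because it runs on every real-world request.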