Generative AI and LLM

44 terms in the Generative AI and LLM domain, each bilingual (TR/EN) with a related-term graph.

Generative AI Fundamentals | Foundation Models | Large Language Models | Prompt Engineering | Fine-Tuning | PEFT / LoRA | Inference Optimization | Quantization | Grounding | Alignment | Hallucination | Multimodal Generative AI

All Terms (44)

3 terms

🛑

Abstention

The ability of a model to avoid fabricating certainty and instead decline or express uncertainty when it is not confident.

🧱

Adapters

A parameter-efficient approach that inserts small modules into the base model to enable task adaptation.

➡️

Autoregressive Decoding

A generation mode in which the model produces output token by token using previous outputs as context.
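The token-by-token loop can be sketched with a toy lookup table standing in for a real model. `TOY_MODEL` and `greedy_decode` are hypothetical names, and the table conditions only on the last token, which is a simplification of how a real model uses the full context.

```python
# Hypothetical toy "model": next-token probabilities given only the last token.
TOY_MODEL = {
    "<bos>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, "<eos>": 0.3},
    "dog": {"sleeps": 0.2, "<eos>": 0.8},
    "sleeps": {"<eos>": 1.0},
}


def greedy_decode(model, max_tokens=10):
    """Generate one token at a time, feeding each output back in as context."""
    tokens = ["<bos>"]
    for _ in range(max_tokens):
        next_probs = model[tokens[-1]]
        # Greedy choice: take the most probable next token.
        next_token = max(next_probs, key=next_probs.get)
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


print(greedy_decode(TOY_MODEL))  # ['the', 'cat', 'sleeps']
```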

4 terms

🧠

Catastrophic Forgetting

The problem in which a model loses some of its prior general abilities while being adapted to new tasks.

🔗

Citation Grounding

An approach that improves trust by explicitly showing the source passages supporting the generated answer.

📜

Constitutional AI

An alignment approach that tries to guide model behavior through explicit principle sets and normative rules.

📦

Continuous Batching

A serving approach that increases throughput by dynamically merging requests arriving at different times into the same processing flow.

2 terms

⚖️

Direct Preference Optimization

An alignment approach that optimizes the model directly on preference pairs, without training a separate reward model or running a reinforcement learning loop.
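As a rough illustration, the per-pair DPO loss can be written from four sequence log-probabilities. The function name, the argument names, and the `beta` default below are assumptions chosen for this sketch; real implementations work on batches of tensors.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(margin)).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(m)) equals log(1 + exp(-m)).
    # Note: math.exp can overflow for very large negative margins.
    return math.log1p(math.exp(-margin))
```

When the policy equals the reference the margin is zero and the loss is log 2; the loss falls as the policy raises the chosen response relative to the rejected one.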

🏥

Domain-Adaptive Fine-Tuning

An approach that adapts a model to the terminology and usage style of specific domains such as law, healthcare, or finance.

1 term

🌟

Emergent Capabilities

Task behaviors that appear significantly stronger once a model reaches a certain scale.

1 term

📌

Factuality

A quality dimension describing how well generated content aligns with real-world facts, source data, or verifiable truth.

1 term

✨

Generative Model

A family of models that can generate new samples rather than only predicting labels.

1 term

🌫️

Hallucination

The phenomenon in which a model generates fluent but unsupported or incorrect content.

3 terms

4️⃣

INT4 Quantization

An aggressive quantization approach that reduces the model to 4-bit precision for much lower memory cost.

8️⃣

INT8 Quantization

A common quantization form that reduces weights and sometimes activations to 8-bit precision for balanced efficiency and quality.
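One common variant, symmetric absmax quantization, is sketched below in plain Python. The function names are hypothetical, and production libraries typically quantize per channel or per block rather than over one flat list.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers and scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The rounding error per value is at most half of `scale`, which is why a few large outliers (which inflate `scale`) can hurt quality.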

📋

Instruction Model

A version of a general language model adapted to follow task instructions more effectively.

1 term

⚙️

KV Cache

A mechanism that stores the attention keys and values of previously processed tokens so they are not recomputed at every step of autoregressive generation.
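A minimal single-head sketch of the idea follows. It assumes identity projections (each token embedding serves as its own key and value), which a real Transformer does not do; `KVCache`, `attend`, and `decode_step` are hypothetical names.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention for one query over all cached positions."""
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(len(query))
              for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum((w / total) * v[d] for w, v in zip(weights, values))
            for d in range(dim)]


class KVCache:
    """Keeps keys and values of earlier tokens so each step adds only one."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_step(cache, embedding):
    # Assumption: identity projections, so key == value == embedding.
    cache.append(embedding, embedding)
    return attend(embedding, cache.keys, cache.values)
```

Each call to `decode_step` computes a key and value for the new token only; earlier entries are read from the cache.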

1 term

🧩

LoRA

A popular PEFT method that enables efficient fine-tuning by representing weight updates with low-rank matrices.
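The low-rank update can be illustrated with nested lists. In this sketch the frozen weight `w` has shape d_out x d_in, `a` has shape r x d_in, and `b` has shape d_out x r; the helper names and the `alpha / r` scaling convention are assumptions taken from common LoRA implementations.

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
            for row in left]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A without modifying the frozen W."""
    r = len(a)  # rank of the update
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


w = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
a = [[0.0, 2.0]]               # trainable, r x d_in
b = [[1.0], [0.0]]             # trainable, d_out x r
adapted = lora_effective_weight(w, a, b, alpha=1.0)
```

Only `a` and `b` are trained, so the trainable parameter count is r * (d_in + d_out) instead of d_in * d_out.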

3 terms

🧠

Mixture of Experts

An approach in which only relevant expert subnetworks are activated for each input to achieve scale and efficiency.
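A simplified top-k routing sketch, with scalar functions standing in for expert subnetworks. Here the gate logits are passed in directly, whereas a real router would compute them from the input; all names are hypothetical.

```python
import math


def top_k_moe(x, gate_logits, experts, k=2):
    """Run only the k highest-scoring experts and mix their outputs."""
    ranked = sorted(range(len(experts)),
                    key=lambda i: gate_logits[i], reverse=True)
    selected = ranked[:k]
    # Softmax over the selected experts' logits only.
    top = max(gate_logits[i] for i in selected)
    weights = [math.exp(gate_logits[i] - top) for i in selected]
    total = sum(weights)
    output = sum((w / total) * experts[i](x)
                 for w, i in zip(weights, selected))
    return output, selected


experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x]
output, used = top_k_moe(4.0, [0.0, 5.0, 5.0], experts, k=2)
```

The unselected expert is never called, which is where the compute saving comes from.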

💾

Model Checkpoint

A saved model state captured at a certain stage of training and reusable later.

🧠

Multimodal Transformer

A model design that processes different data types such as text, images, audio, or video within a shared attention architecture.

6 terms

🗂️

Paged Attention

An attention-management technique that stores the KV cache in fixed-size blocks (pages) that need not be contiguous in memory, which reduces fragmentation and improves resource use under multi-request serving.

🪶

Parameter Efficient Fine-Tuning

A fine-tuning approach that adapts a model by training only a small number of parameters instead of updating the full model.

📉

Post-Training Quantization

A quantization approach that converts an already trained model to lower-bit precision, without further training, to gain memory and speed benefits.

🏷️

Prefix Tuning

A PEFT technique that steers the model’s internal attention behavior through small learnable prefix representations.

🏗️

Pretraining

The initial training stage in which a model learns broad patterns from large-scale general data.

📄

Prompt Template

A parameterized prompt pattern that provides reusable and consistent structure across repeated tasks.
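A small example using Python's standard `string.Template`. The template text and the `render_prompt` helper are illustrative assumptions, not a prescribed format.

```python
from string import Template

# Hypothetical reusable template with three named placeholders.
SUMMARY_TEMPLATE = Template(
    "Summarize the following $doc_type in $max_sentences sentences:\n\n$text"
)


def render_prompt(doc_type, max_sentences, text):
    """Fill the template; substitute() raises KeyError if a field is missing."""
    return SUMMARY_TEMPLATE.substitute(
        doc_type=doc_type, max_sentences=max_sentences, text=text
    )


prompt = render_prompt("contract", 3, "This agreement is made between ...")
```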

2 terms

💾

QLoRA

An approach that performs LoRA adaptation on a quantized base model to enable fine-tuning at lower hardware cost.

⚙️

Quantization Aware Training

An approach that simulates low-precision arithmetic during training so that the model preserves quality after quantization.

2 terms

🎛️

Reinforcement Learning from Human Feedback

An alignment approach that uses reward signals to make model outputs more consistent with human preferences.

🏆

Reward Model

An auxiliary model that estimates how preferable generated outputs are and provides signals for alignment.

5 terms

🎲

Sampling

The process of making probabilistic choices from a learned distribution while generating new output.
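Drawing one token from a next-token distribution can be sketched with the standard library. The `sample_token` helper and the example vocabulary are assumptions for illustration.

```python
import random


def sample_token(tokens, probs, rng=None):
    """Draw one token at random, weighted by its probability."""
    rng = rng or random
    return rng.choices(tokens, weights=probs, k=1)[0]


vocab = ["cat", "dog", "bird"]
# A seeded generator makes the draw reproducible across runs.
choice = sample_token(vocab, [0.6, 0.3, 0.1], rng=random.Random(0))
```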

📈

Scaling Laws

A set of empirical regularities describing how performance changes as model size, data, and compute increase.

⚡

Speculative Decoding

A decoding approach that speeds up generation by validating proposals from a smaller fast model with a larger model.
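The greedy form of the idea can be sketched as follows. `target_next` and `draft_next` are hypothetical callables that return the next token for a given context. This sketch calls the target once per token for clarity, whereas a real system scores all drafted positions in a single forward pass, which is where the speed-up comes from.

```python
def speculative_decode(target_next, draft_next, prompt, num_tokens, k=4):
    """Greedy speculative decoding: output matches the target model alone."""
    tokens = list(prompt)
    goal = len(prompt) + num_tokens
    while len(tokens) < goal:
        # 1. The small draft model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. The target model checks the proposals in order.
        for proposed in draft:
            expected = target_next(tokens)
            tokens.append(expected)  # identical to `proposed` when accepted
            if proposed != expected:
                break  # first mismatch: discard the rest of the draft
    return tokens[len(prompt):goal]
```

Because a rejected proposal is replaced by the target's own token, the result is the same as decoding with the target model only.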

🎇

Stochastic Generation

A generation mode that introduces probabilistic diversity instead of producing the exact same output every time.

🧭

System Prompt

A high-level instruction layer that defines the model’s overall behavior, role, and priorities.

6 terms

🌡️

Temperature Sampling

A parameter that adjusts output distribution sharpness to produce more controlled or more creative generation.
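The effect of temperature on a set of logits can be shown with a short softmax helper. The function name and example logits are assumptions for this sketch.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)
plain = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
```

Values below 1 concentrate probability on the top token; values above 1 spread it more evenly across candidates.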

🖥️

Tensor Parallelism

A technique that scales inference and training by splitting large model computations across devices within layers.

🖼️

Text-to-Image Generation

A generative modeling approach that synthesizes new images from natural language prompts.

🔤

Tokenizer

A core intermediary layer that converts text into tokens the model can process.
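A toy word-level tokenizer illustrates the text-to-ID mapping. Production tokenizers usually split text into subword units instead; `ToyTokenizer` is a hypothetical class for this sketch only.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a fixed vocabulary and an unknown token."""

    def __init__(self, corpus):
        self.token_to_id = {"<unk>": 0}
        for word in sorted(set(corpus.lower().split())):
            self.token_to_id[word] = len(self.token_to_id)
        self.id_to_token = {i: t for t, i in self.token_to_id.items()}

    def encode(self, text):
        """Convert text to a list of integer IDs; unseen words map to 0."""
        return [self.token_to_id.get(w, 0) for w in text.lower().split()]

    def decode(self, ids):
        """Convert IDs back to a space-joined string."""
        return " ".join(self.id_to_token[i] for i in ids)


tokenizer = ToyTokenizer("the cat sat")
ids = tokenizer.encode("The cat")
```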

🛠️

Tool-Augmented Generation

An approach in which the model uses tools such as computation, search, or external system calls to produce more accurate results.

🔁

Transferability

The ability of a model to transfer what it learned during pretraining into different tasks and domains.

1 term

📏

Uncertainty Calibration

A quality approach aimed at making model confidence better aligned with actual correctness.
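One common way to measure calibration is expected calibration error (ECE), sketched below with equal-width confidence bins. The function name and the bin count are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 90% confidence but is right only 25% of the time in that bin contributes a gap of 0.65; a perfectly calibrated model scores 0.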

1 term

✅

Verification Loop

A workflow pattern that attempts to validate model output through additional checks, source review, or second-stage verification.
