📚 Module 2: PEFT — The Efficient Fine-Tuning Paradigm

2.1 Introduction to PEFT

PEFT (Parameter-Efficient Fine-Tuning) is a set of techniques that allow adapting large models by updating only a small fraction of their parameters while keeping the rest frozen. This drastically reduces memory requirements, accelerates training, and mitigates the risk of catastrophic forgetting.

The fundamental principle of PEFT is simple: not all model parameters need to be updated for the model to learn a new task. In many cases, introducing small structured modifications — such as low-rank matrices, adapter layers, or activation shifts — is sufficient to guide the model toward the desired behavior.
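To make the low-rank idea concrete, here is a minimal sketch, in plain PyTorch with placeholder dimensions, of a LoRA-style linear layer: the pretrained weight stays frozen, and only a small trainable low-rank update B·A is learned on top of it.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style layer: output = frozen base(x) + scaled low-rank update."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False  # freeze the pretrained weight
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # the update starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.lora_B(self.lora_A(x)) * self.scaling

# With hidden size 768 and r=8, only ~2% of the layer's parameters train.
layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable}/{total} ({100 * trainable / total:.2f}%)")
```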

PEFT is not a single method but a family of techniques including:

  • LoRA (Low-Rank Adaptation)
  • Adapter Layers
  • Prefix Tuning / Prompt Tuning
  • QLoRA (Quantized LoRA)

Among these, LoRA and QLoRA are currently the most popular and widely adopted in the community, thanks to their simplicity, effectiveness, and excellent support in libraries like Hugging Face PEFT.
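As an illustration, here is a minimal sketch of applying LoRA with the Hugging Face PEFT library. The base model and hyperparameter values are placeholder choices, not recommendations.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; swap in whichever checkpoint you are adapting.
model = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # fused attention projection in GPT-2
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # reports trainable vs. total parameters
```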

2.2 Advantages of PEFT Over Full Fine-Tuning

| Feature | Full Fine-Tuning | PEFT (LoRA/QLoRA) |
|---|---|---|
| Trainable parameters | 100% | ~0.1%–1% |
| VRAM requirement | Very high (tens of GB) | Low to moderate (can run on a 16 GB GPU) |
| Training time | Long | Short to moderate |
| Risk of catastrophic forgetting | High | Low |
| Model portability | Entire model must be saved | Only the PEFT parameters (small files) are saved |
| Reusability of base model | Not directly possible | Yes: multiple adapters can be loaded onto the same base model |
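The last two rows are worth seeing in code. Below is a sketch, using Hugging Face PEFT, of saving only the adapter weights and then attaching several adapters to one shared base model; the paths and adapter names are hypothetical.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# After training, saving the PEFT model writes only the adapter weights
# (typically a few megabytes), not the full base model:
# peft_model.save_pretrained("adapters/summarization")

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

# Attach a first adapter, then load a second one onto the same base.
model = PeftModel.from_pretrained(
    base, "adapters/summarization", adapter_name="summarization"
)
model.load_adapter("adapters/qa", adapter_name="qa")

# Switch the active adapter without reloading the base model.
model.set_adapter("qa")
```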

2.3 When to Use PEFT

PEFT is ideal in the following scenarios:

  • Limited hardware is available (a GPU with 16 GB of VRAM or less).
  • The fine-tuning dataset is small (<10,000 examples).
  • You want to preserve the base model’s general knowledge.
  • You plan to adapt the same base model for multiple tasks (multi-task learning).
  • You aim to reduce training and storage costs.
  • You are prototyping rapidly or conducting iterative experimentation.

PEFT is not recommended when:

  • The fine-tuning dataset is extremely large and diverse (in which case full fine-tuning may yield better results).
  • A radical change in the model’s output distribution is required (e.g., completely changing vocabulary or linguistic domain).
  • Unlimited access to high-end hardware is available and time is not a limiting factor.