In QLoRA, the backward pass memory footprint is reduced through three core mechanisms: 4-bit NormalFloat quantization, double quantization, and the frozen state of the main weights. First, 4-bit NormalFloat (NF4) is a data type that assumes neural network weights follow a normal distribution. By mapping these weights to a 4-bit space such that each value has an equal probability of occurring, NF4 maximizes information density, allowing the massive pre-tra....
Log in to view the answer