Entropy coding/CABAC/CAVLC

Entropy Coding is a lossless data compression scheme that is independent of the specific characteristics of the medium. One of the main types of entropy coding creates and assigns a unique prefix code to each unique symbol that occurs in the input. These entropy encoders then compress data by replacing each fixed-length input symbol with the corresponding variable-length prefix codeword. The length of each codeword is approximately proportional to the negative logarithm of the probability. Therefore, the most common symbols use the shortest codes. According to Shannon's source coding theorem, the optimal code length for a symbol is logbP, where b is the number of symbols used to make output codes and P is the probability of the input symbol. The two most commonly used entropy encoding techniques are Huffman coding and arithmetic coding.

Context-Adaptive Binary Arithmetic Coding (CABAC) - Lossless compression technique for obtaining a higher compression ratio. Uses binarization and Arithmetic Coding based on a suitable probabilistic model that depends on the context (sequence element that contains the current symbol) and is known in advance.

Context-Adaptive Variable-Length Coding (CAVLC) - lossless compression technique that is used to encode residual, zig-zag ordered 4x4 (and 2x2) blocks of transform coeﬃcients. CAVLC is designed to take advantage of several characteristics of quantized 4x4 blocks.

Binary Arithmetic Coding - Uses a binary representation of the Arithmetic Coding result. Or encoding can be done using the doubling interval technique, and the result is immediately represented as a binary number.

Arithmetic Coding - Coding technique for symbol sequence that uses the division of the numerical interval into intervals according to the probability of symbol occurrence in the given sequence. In the case of arithmetic coding, the number of intervals can be unlimited and depends only on the number of encoded symbols. Encoding is performed by recursive dividing of the current interval to several intervals corresponding to symbols probabilities and selecting current symbol interval as next recursive step, then this interval is divided into new intervals in accordance with the probabilities of the appearance of symbols in the entire sequence. The result of encoding will be the number belonging to the last interval. To restore the original sequence, recursive division of the original interval is used in accordance with the probabilities of symbols occurrence and the choice of a symbol for the interval, which contains the number obtained during encoding.