The structural similarity index measure (SSIM) is a method for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. SSIM is used for measuring the similarity between two images. The SSIM index is a full reference metric; in other words, the measurement or prediction of image quality is based on an initial uncompressed or distortion-free image as a reference. The SSIM index is a development of traditional methods such as PSNR (peak signal-to-noise ratio) and the MSE method, which turned out to be incompatible with the physiology of human perception. The difference with other techniques such as MSE or PSNR is that these approaches estimate absolute errors. Structural information is the idea that the pixels have strong inter-dependencies especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene.
The SSIM index is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:
The above formula is applicable only for the brightness of the image, which is used to assess the quality. The resulting SSIM index ranges from −1 to +1. A value of +1 is achieved only with the complete authenticity of the samples. Typically, the metric is calculated for an 8 × 8 pixel window. The window can be displaced by a pixel, but experts recommend using groups of windows to reduce the complexity of the calculations.
A more advanced form of SSIM, called Multiscale SSIM (MS-SSIM), is performed at multiple scales through a multi-step downsampling process, reminiscent of multiscale processing in the early visual system. It has been shown to perform equally well or better than SSIM with various databases of subjective images and videos.