Video and audio files can be compressed thanks to so-called data redundancy. How does this happen? Imagine you spent five minutes recording a seascape – like the one in the picture:
Let’s say your camera shoots at 30 fps. This means it saves 30 separate images to memory every second. So in five minutes (300 seconds), it shoots 9,000 frames!
But realistically, what can drastically change in your landscape within one second? Will the sky turn green? Will the water evaporate?
Even if there are some changes, they will be smooth, not sudden. Consequently, your camera is shooting 30 almost-identical frames every second.
So why keep all these near-identical frames in their entirety in the camera’s memory? To record a landscape video, a codec saves one full reference frame, finds the frames that resemble it, and discards the image elements they repeat, keeping only what has changed. During playback, the codec reconstructs each frame by inserting the changed parts back into the reference image. When the scene changes substantially, the codec picks a new reference frame and repeats the process with its derivatives. This technique is called “motion compensation” and is one of the main ways video data are compressed.
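Real codecs do this with motion vectors and block-level prediction, but the core keyframe-plus-differences idea can be sketched in a few lines. Below is a minimal Python/NumPy illustration, not any actual codec’s algorithm; the `encode` and `decode` functions and the keyframe interval are made up for the example.

```python
import numpy as np

def encode(frames, keyframe_interval=30):
    """Toy interframe encoder: store one full keyframe, then for every
    following frame keep only the pixels that differ from that keyframe."""
    encoded, reference = [], None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            reference = frame                     # new reference ("original") frame
            encoded.append(("key", frame.copy()))
        else:
            changed = frame != reference          # mask of pixels that changed
            coords = np.argwhere(changed)         # where they are
            values = frame[changed]               # what they became
            encoded.append(("delta", coords, values))
    return encoded

def decode(encoded):
    """Rebuild every frame by inserting each delta back into its keyframe."""
    frames, reference = [], None
    for entry in encoded:
        if entry[0] == "key":
            reference = entry[1]
            frames.append(reference.copy())
        else:
            _, coords, values = entry
            frame = reference.copy()
            frame[coords[:, 0], coords[:, 1]] = values
            frames.append(frame)
    return frames

# 90 frames of a nearly static 4x4 "seascape": one pixel flickers on
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(90)]
for f in frames[1:]:
    f[0, 0] = 1
assert all(np.array_equal(a, b)
           for a, b in zip(frames, decode(encode(frames))))
```

For this toy clip, each delta stores one coordinate and one value instead of all 16 pixels – the same kind of saving, on a vastly larger scale, that a real codec gets from 9,000 near-identical seascape frames.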
Motion compensation is just one of the many techniques video codecs apply when processing camera recordings. Audio codecs use their own methods for removing redundant information. As a result of the codecs’ work, most redundant data are deleted from audio and video streams, significantly reducing the size of the encoded file.