[Originally posted 16 December 2010]
This is the first of a series of Awkward Questions. It’s a bad idea to carry on with inefficient, costly practices just because “that’s the way it has to be”. I would like to challenge some of the basic assumptions about the way digital video and audio are created, distributed and consumed.
If you host, deliver or distribute video/audio content for multiple clients, you may well be faced with the problem of encoding the same video clip into a number of different codec formats, such as H.264/MPEG-4 Part 10, VC-1 (WMV), VP6, VP8 and MPEG-2. This is known as transcoding, and it’s time-consuming, expensive and wasteful of storage space. Every time you convert from one lossy codec format into another, video quality is reduced (“generation loss”).
So, why encode the same video file into multiple formats? The usual answer goes something like this: (a) there are many different formats and (b) each client can only handle some of them, so (c) you have to re-code your source material into as many formats as possible, work out which format a particular client needs and deliver that file to the client.
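To make the stock answer concrete, here is a minimal sketch of step (c) in Python. The file names, the codec labels and the pick_rendition helper are all made up for illustration; a real delivery system would negotiate formats in its own way.

```python
# Hypothetical catalogue: one pre-transcoded file per codec for the same clip.
RENDITIONS = {
    "h264": "clip_h264.mp4",
    "vp8": "clip_vp8.webm",
    "mpeg2": "clip_mpeg2.mpg",
}


def pick_rendition(client_codecs, renditions=RENDITIONS):
    """Return the first stored rendition the client says it can decode."""
    for codec in client_codecs:  # client lists codecs in order of preference
        if codec in renditions:
            return renditions[codec]
    return None  # no match: the clip would need yet another transcode


print(pick_rendition(["vp8", "h264"]))
print(pick_rendition(["vc1"]))
```

Every entry in that catalogue is another copy of the same clip to encode and store, and a client that only supports a format you haven't produced gets nothing.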
Wouldn’t it be better to do things differently? Given the ever-increasing computational power and programmability of client platforms – your new TV probably has several powerful DSP chips inside – why not (a) pick just one codec format for your video and (b) embed the decoding instructions in the video file itself? The client doesn’t need to know the format in advance; it simply uses these instructions to decode the file correctly. Your server doesn’t need to guess in advance which codecs each client can support. You can pick the best codec for your content, rather than re-coding it into many different formats.
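Here is a toy sketch of the idea in Python, under heavy assumptions. The "SDV0" signature, the header layout and the two decoding primitives are invented for this example and are nowhere near a real video codec. The point is only that the file names its own decoding steps, and the client runs them using a small set of generic building blocks.

```python
import json
import struct

MAGIC = b"SDV0"  # made-up file signature for this sketch


def rle_expand(data: bytes) -> bytes:
    """Expand (count, value) byte pairs into runs of repeated values."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


def delta_decode(data: bytes) -> bytes:
    """Rebuild samples from successive differences, modulo 256."""
    out = bytearray()
    prev = 0
    for d in data:
        prev = (prev + d) % 256
        out.append(prev)
    return bytes(out)


# Generic building blocks the client ships with (hypothetical set).
PRIMITIVES = {"rle_expand": rle_expand, "delta_decode": delta_decode}


def pack(steps, payload: bytes) -> bytes:
    """Write signature, header length, JSON header with the steps, then payload."""
    header = json.dumps({"steps": steps}).encode("utf-8")
    return MAGIC + struct.pack(">I", len(header)) + header + payload


def decode(blob: bytes) -> bytes:
    """Read the decoding steps from the file itself and apply them in order."""
    if blob[:4] != MAGIC:
        raise ValueError("not a self-describing file")
    (header_len,) = struct.unpack(">I", blob[4:8])
    steps = json.loads(blob[8:8 + header_len])["steps"]
    data = blob[8 + header_len:]
    for name in steps:
        data = PRIMITIVES[name](data)
    return data


# Samples 10,10,10,10,12,12 stored as run-length-coded differences.
blob = pack(["rle_expand", "delta_decode"], bytes([1, 10, 3, 0, 1, 2, 1, 0]))
print(list(decode(blob)))
```

The client here knows nothing about the format beyond how to read the header. A different file could list the steps in another order, or use other primitives, without the server having to know what the client supports. A practical scheme would need a much richer instruction set than a list of names, and a safe way to run it.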
Go on, ask the awkward question and don’t accept the stock answer.