The problem with Audio Description implementation on Freeview is that it isn't done in a way that is backwards compatible with old receivers, as the commentary is broadcast on its own on a separate audio channel.
This means the box needs to be capable of decoding both the normal audio channel and the AD audio channel and mixing them together.
Any idea why they chose to do it that way? It seems overly complicated. If the AD is pre-mixed with the programme audio, you can lower the programme volume during things like establishing shots to let the AD through. If that's still possible with a separate channel, there must be some other kind of data going along with the AD to control programme volume.
I imagine it's to save bandwidth. A speech-only track can use a lower bit rate than one carrying a wider variety of sounds and music, and it can be mono rather than stereo.
That's part of it, yep.
When the Audio Described programme leaves the playout suite, it's got regular stereo audio on tracks 1 & 2. Track 3 is the narration, and track 4 is a control track - it sounds a bit like Linear Timecode, but it's actually a data track which contains instructions on what to do with the main programme audio. There's a description of it in this BBC R&D White Paper [pdf] - it's quite robust and can survive being compressed to buggery*.
During the coding & mux process for the DSat platform, there's a bit of kit which does the downmix and creates a new audio pair. For DTT, it's just transmitted as-is, and the set-top box does the mix.
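To make the receiver-side mix a bit more concrete, here's a rough Python sketch. It is only an illustration: the function name mix_receiver is made up, and it assumes the control track has already been decoded into a per-sample fade value in dB, which is not how the real data format is described anywhere in this thread.

```python
def mix_receiver(left, right, narration, fade_db):
    """Mix mono narration into a stereo programme pair.

    ASSUMPTION: fade_db[i] is the attenuation in dB (>= 0) to apply to the
    programme audio at sample i, already decoded from the control track.
    Samples are floats in the range -1.0 to 1.0.
    """
    def clamp(x):
        # Keep the mixed sample inside the legal range.
        return max(-1.0, min(1.0, x))

    out_left, out_right = [], []
    for l, r, n, fade in zip(left, right, narration, fade_db):
        gain = 10 ** (-fade / 20)  # dB attenuation to linear gain
        # Duck the programme, then add the narration equally to both sides.
        out_left.append(clamp(l * gain + n))
        out_right.append(clamp(r * gain + n))
    return out_left, out_right


# Second sample is ducked by 20 dB so the narration can be heard.
mixed_left, mixed_right = mix_receiver(
    left=[1.0, 1.0],
    right=[0.5, 0.5],
    narration=[0.0, 0.2],
    fade_db=[0.0, 20.0],
)
```

The downmix kit on the DSat side would presumably do much the same thing before transmission and send only the resulting pair.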
Some DTT STBs don't pay any attention to the control track (or lack thereof) and will just mix in whatever they find on the third audio track. This has caused problems in the past: an OB which was working into another OB (Springwatch Unsprung, IIRC) was using that audio track for director's talkback, so some viewers who had AD turned on got to listen to it.
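A better-behaved box would only mix in track 3 when there's valid control data alongside it. A minimal sketch of that check, assuming (hypothetically) that the decoder hands over None wherever it finds no valid control data and a fade value in dB where it does; the function name mix_track3 is invented for the example:

```python
def mix_track3(left, right, track3, control):
    """Mix track 3 into the programme only where control data is present.

    ASSUMPTION: control[i] is None when no valid control data was decoded
    for sample i, otherwise the programme attenuation in dB (>= 0).
    """
    out_left, out_right = [], []
    for l, r, t, c in zip(left, right, track3, control):
        if c is None:
            # No control data: treat track 3 as something other than AD
            # (e.g. talkback) and pass the programme through untouched.
            out_left.append(l)
            out_right.append(r)
        else:
            gain = 10 ** (-c / 20)  # dB attenuation to linear gain
            out_left.append(l * gain + t)
            out_right.append(r * gain + t)
    return out_left, out_right


# First sample has no control data, so track 3 is ignored there.
safe_left, safe_right = mix_track3(
    left=[0.5, 0.5],
    right=[0.25, 0.25],
    track3=[0.3, 0.3],
    control=[None, 0.0],
)
```

With a check like this, talkback on the third track would never reach the viewer, because it arrives without any control data.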
* technical term