So a recent conversation rekindled my interest in a side project I'd thought about a while ago. It's a way of automatically recording, tagging and uploading the idents and continuity announcements for BBC One and Two. A ‘Genome for idents’?!
Read on if you want to know the technicals behind it, otherwise see what you think at the reborn
The BBC send an EIT p/f (present/following) signal to note when the next programme is about to start. Luckily this (generally) gets flagged at the start of the ident. So I connected up a Raspberry Pi with a DVB-T2 292e nanoStick and the Tvheadend software, which can do so-called 'accurate recordings'. As both channels are on the same mux I can record both without a second tuner. A pre-record script sets a timer to cut the recording after about a minute (no point letting it record the whole show). A post-record script then uploads the recording to an Amazon S3 bucket.
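For anyone wanting to try something similar, here is a rough sketch of what the upload half of a post-record script could look like. It is not the actual script: the `raw/<channel>/<date>/<time>` key layout, the function names and the bucket name are all made up for illustration, and it assumes the script is handed the file path, channel name and start time as arguments. The only real library call is boto3's S3 `upload_file`.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_object_key(channel: str, start_unix: int, recording_path: str) -> str:
    """Build an S3 key from the channel and start time (hypothetical layout)."""
    started = datetime.fromtimestamp(start_unix, tz=timezone.utc)
    slug = channel.strip().lower().replace(" ", "-")
    suffix = Path(recording_path).suffix or ".ts"
    return f"raw/{slug}/{started:%Y/%m/%d}/{started:%H%M%S}{suffix}"


def upload_recording(recording_path, channel, start_unix, bucket, upload=None):
    """Upload one recording and return the key it was stored under.

    `upload` is any callable taking (filename, bucket, key); it defaults to
    boto3's S3 upload_file, imported lazily so the helper above can be
    tried without AWS credentials.
    """
    key = build_object_key(channel, start_unix, recording_path)
    if upload is None:
        import boto3  # assumption: boto3 is installed on the Pi

        upload = boto3.client("s3").upload_file
    upload(recording_path, bucket, key)
    return key
```

Keeping the channel and start time in the key means later steps can work out which junction a clip belongs to without opening the file.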
A few Amazon tools come into play. A Lambda function is triggered when a new recording has been uploaded and sends the file to the Elastic Transcoder, which cuts the duration down, compresses the video and generates the thumbnail. I tried doing this bit on the Pi but it just killed it.
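As a sketch of that step (again, not the real function), a Lambda handler could turn the S3 event into an Elastic Transcoder job roughly like this. The pipeline ID, preset ID, 45-second cap and `clips/` / `thumbs/` prefixes are placeholders I've invented; the `create_job` fields (`PipelineId`, `Input.TimeSpan`, `Outputs[].ThumbnailPattern`) are the standard boto3 ones.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Hypothetical placeholders: substitute your own pipeline and preset.
PIPELINE_ID = "your-pipeline-id"
PRESET_ID = "your-preset-id"


def build_transcode_job(event, pipeline_id, preset_id, max_seconds=45):
    """Translate an S3 'object created' event into create_job arguments."""
    record = event["Records"][0]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    stem = PurePosixPath(key).with_suffix("")
    return {
        "PipelineId": pipeline_id,
        "Input": {
            "Key": key,
            # Trim the clip: keep only the first max_seconds seconds.
            "TimeSpan": {"StartTime": "0.000", "Duration": f"{max_seconds:.3f}"},
        },
        "Outputs": [
            {
                "Key": f"clips/{stem}.mp4",
                "PresetId": preset_id,
                # Elastic Transcoder requires {count} in the thumbnail pattern.
                "ThumbnailPattern": f"thumbs/{stem}-{{count}}",
            }
        ],
    }


def handler(event, context):
    """Lambda entry point: submit the job for the newly uploaded recording."""
    import boto3  # available by default in the Lambda Python runtime

    job = build_transcode_job(event, PIPELINE_ID, PRESET_ID)
    return boto3.client("elastictranscoder").create_job(**job)
```

Splitting the argument-building from the AWS call makes it easy to check the job definition locally before deploying anything.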
Once complete, another Lambda function adds it to TV Home's database. Initially I was then giving the Amazon Rekognition service the thumbnail and it would detect the scene and return words that match. On the whole this worked quite well - "bicycle" "bike" etc. for the cyclists ident and "football" "soccer" for the football training one. It wasn't perfect though and I imagine would be pretty poor for BBC Two, so I switched to doing a basic image similarity check instead. Each ident is stored in the DB with an example image and the thumbnail is compared against those. The accuracy is high but it will get stuck occasionally (e.g. the excellent Arts idents on BBC Two).
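To give an idea of what a basic image similarity check can look like, here is a minimal sketch using an average hash and Hamming distance. This is one common technique and not necessarily the comparison the site uses. It assumes each image has already been shrunk to a small greyscale grid (a list of rows of 0-255 values); the function names and the `max_distance` threshold are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the image's mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming(a, b):
    """Count the positions where two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("hashes must be the same length")
    return sum(x != y for x, y in zip(a, b))


def best_match(thumbnail, examples, max_distance):
    """Return the name of the closest example image, or None if nothing is close.

    `examples` maps an ident name to its example image grid.
    """
    thumb_hash = average_hash(thumbnail)
    best_name, best_distance = None, None
    for name, example in examples.items():
        distance = hamming(thumb_hash, average_hash(example))
        if best_distance is None or distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance is not None and best_distance <= max_distance:
        return best_name
    return None
```

Returning `None` when nothing is within the threshold gives a natural place to flag a clip for manual tagging, which would suit idents with lots of visual variation.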
The biggest problem at the moment is that the idents are sometimes incomplete. BBC Two is great - the flag generally seems to fire right at the start, giving all the audio and all bar the first second of video. BBC One is more haphazard - I don't know whether it's the software, hardware, a dodgy signal or the broadcaster.
The final announcement of the night always seems to record from the BBC News ident, rather than BBC One. Not sure why they would treat that junction any differently, would love to know if anyone has any ideas. For other junctions, there doesn't seem to be any pattern - one day, half the recordings were opening titles rather than idents.
I'd be interested to know who/what triggers it in the system.
The One Show had to be manually adjusted for a while too, thanks to it starting minutes before the published time! It also didn’t record last Sunday’s MOTD - possibly treating it as a repeat, although the software is set up to record all.
I did end up using the Rekognition service though, to detect text on the thumbnail - it's a nice way of guessing whether the ident or the programme was actually recorded (I can fade it out on the page if it's the programme).
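A hedged sketch of how the text check could work: Rekognition's DetectText response contains a `TextDetections` list whose entries carry `DetectedText`, `Type` (`LINE` or `WORD`) and `Confidence`. The helpers below just pick out the confident lines and look for marker strings. Which strings indicate an ident rather than a programme is my assumption, as are the function names and the 80% threshold.

```python
def detected_lines(response, min_confidence=80.0):
    """Pull the confident LINE-level strings out of a DetectText response."""
    return [
        item["DetectedText"]
        for item in response.get("TextDetections", [])
        if item.get("Type") == "LINE" and item.get("Confidence", 0.0) >= min_confidence
    ]


def has_marker(lines, markers):
    """True if any detected line contains any marker (case-insensitive)."""
    lowered = [line.lower() for line in lines]
    return any(marker.lower() in line for marker in markers for line in lowered)


def fetch_text(bucket, key):
    """Call Rekognition on a thumbnail stored in S3 (needs AWS credentials)."""
    import boto3

    client = boto3.client("rekognition")
    return client.detect_text(Image={"S3Object": {"Bucket": bucket, "Name": key}})
```

For example, `has_marker(detected_lines(fetch_text(bucket, key)), ["one", "two"])` could be one signal among several, with the page fading out anything that fails the check.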
Obviously there's loads of things that could be done with it - idents by programme, most frequently shown, highlighting first appearances etc. I could switch to HD too and avoid the regional differences but that will mean larger uploads.
This was all a bit of an experiment and a quick win to see if I could get something up and running. Hopefully some find it interesting!
Last edited by Asa on 20 September 2018 10:55pm - 3 times in total