Why think of it as a compression problem? Isn’t the spy device already getting compressed video from some source? That makes it a filtering problem. You would set it to grab and ship key frames (or the equivalent term) if you wanted a human to be able to see the intel. But for content matching, maybe count off some interval of key frames and then grab the smallest difference frame between the next two key frames. That gives a nice, premade small data chunk. A few of those in sequence start looking like a hash function (on a dark foggy night).
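To make the frame-picking idea concrete, here is a rough sketch. It assumes the device sees an already-encoded stream where each frame is tagged as key or difference; `Frame`, `pick_chunks`, and `fingerprint` are names I made up, and SHA-256 is just a stand-in for "something hash-like". It also assumes both sides see byte-identical encoded frames, which may not hold in practice.

```python
import hashlib
from typing import Iterable, List, NamedTuple


class Frame(NamedTuple):
    """Hypothetical stand-in for one encoded frame in the stream."""
    is_key: bool
    payload: bytes


def pick_chunks(frames: Iterable[Frame], key_interval: int) -> List[bytes]:
    """After every `key_interval`-th key frame, keep the smallest
    difference frame seen before the next key frame arrives."""
    chunks: List[bytes] = []
    keys_seen = 0
    smallest = None
    for frame in frames:
        if frame.is_key:
            # A key frame closes the previous group; emit its pick, if any.
            if smallest is not None:
                chunks.append(smallest)
                smallest = None
            keys_seen += 1
            continue
        # Only sample the group that follows every key_interval-th key frame.
        if keys_seen > 0 and keys_seen % key_interval == 0:
            if smallest is None or len(frame.payload) < len(smallest):
                smallest = frame.payload
    # A trailing group with no closing key frame is deliberately dropped.
    return chunks


def fingerprint(chunks: List[bytes]) -> str:
    """Fold a sequence of picked chunks into one hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


stream = [
    Frame(True, b"K1"), Frame(False, b"aaaa"), Frame(False, b"bb"),
    Frame(True, b"K2"), Frame(False, b"cccc"), Frame(False, b"c"),
    Frame(True, b"K3"), Frame(False, b"zz"),
    Frame(True, b"K4"),
]
picked_every = pick_chunks(stream, key_interval=1)
picked_alternate = pick_chunks(stream, key_interval=2)
```

The exact hash only matches when the picked bytes are identical on both sides, so a real version would probably want something more tolerant than SHA-256.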
You would want some way to sync up the frames that the spy device grabs with the ones grabbed when building the db to match against. Maybe resetting the key frame interval counter whenever some set of simple frames comes through would be enough. Like anything with a uniform color across the whole image, or something similar.
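A quick sketch of the reset idea, assuming each key frame has been decoded to a flat list of grayscale pixel values. `is_uniform`, `KeyFrameCounter`, and the tolerance of 2 are all made up for illustration. The point is that two observers who start watching at different times end up picking the same key frames once they have both seen a uniform one.

```python
from typing import List, Sequence


def is_uniform(pixels: Sequence[int], tolerance: int = 2) -> bool:
    """Treat a frame as 'simple' if all pixel values sit within `tolerance`."""
    if not pixels:
        return False
    return max(pixels) - min(pixels) <= tolerance


class KeyFrameCounter:
    """Counts key frames and resets whenever a uniform frame comes through."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.count = 0

    def on_key_frame(self, pixels: Sequence[int]) -> bool:
        """Return True if this key frame is one to fingerprint after."""
        if is_uniform(pixels):
            self.count = 0  # sync point: both sides restart counting here
            return False
        self.count += 1
        return self.count % self.interval == 0


def run(counter: KeyFrameCounter, key_frames: List[Sequence[int]]) -> List[bool]:
    return [counter.on_key_frame(pixels) for pixels in key_frames]


textured = [0, 50, 100, 200]
blank = [16, 16, 17, 16]

# The db builder sees the stream from the start; the device joins two frames late.
db_view = [textured, textured, textured, blank, textured, textured, textured, textured]
device_view = db_view[2:]

db_picks = run(KeyFrameCounter(interval=2), db_view)
device_picks = run(KeyFrameCounter(interval=2), device_view)
```

After the blank frame, the last four decisions in `db_picks` and `device_picks` line up, even though the two counters started out of step.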
Just spitballing here. I like your impulse to math this.
I assumed HDMI had some form of encoding; thanks for the correction. Looks like v2.1 does.
I think the syncing idea between the spy device and db is still useful. The video itself has stuff you can use to reduce the search space, by making sure both sides pick the same instants to fingerprint and exfiltrate.