Monday, 6 May 2013

Finally, some progress in video codecs.

An announcement on Friday via Brendan Eich:

ORBX.js, a downloadable HD codec written in JS and WebGL. The advantages are many. On the good-for-the-open-web side: no encumbered-format burden on web browsers, they are just IP-blind runtimes. Technical wins start with the ability to evolve and improve the codec over time, instead of taking ten years to specify and burn it into silicon.
I think the 'remote-screen viewing of videogames' use case is bogus (if anyone notices latency, it's gamers), but this is a really important development for the reasons Brendan mentions and more.

Nine years ago, I wrote:

I'd say video compression is maybe 2-4 times as efficient (in quality per bit) than it was in 1990 or so when MPEG was standardised, despite computing power and storage having improved a thousandfold since then.

Not much has changed. The video compression techniques we're using everywhere are direct descendants of 1980s signal processing. They treat video as a collection of small 2D blocks that move horizontally and vertically over time, and encode all video this way. If you want to make a codec work hard, you just need to rotate the camera. Partly this is because of the huge patent thicket around video encoding; mostly it's because compression gets less necessary over time as network capacity and storage increase. However, it was obvious 10 years ago that this approach was outdated.
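To make the rotating-camera point concrete, here is a minimal sketch of translational block matching, the basic idea behind the motion compensation described above. It is a toy, not how any real codec is implemented: the frame is random noise, the shift wraps around the edges, the rotation is an extreme 90 degrees, and every name in it (`make_frame`, `shift`, `rotate90`, `best_block_match`, `total_residual`) is made up for illustration.

```python
import random


def make_frame(size, seed=0):
    """Build a size x size greyscale 'frame' of pseudo-random pixels (toy texture)."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(size)] for _ in range(size)]


def shift(frame, dx, dy):
    """Pan the frame by (dx, dy) pixels, wrapping at the edges for simplicity."""
    n = len(frame)
    return [[frame[(y - dy) % n][(x - dx) % n] for x in range(n)] for y in range(n)]


def rotate90(frame):
    """Rotate the frame by a quarter turn (an extreme stand-in for camera roll)."""
    n = len(frame)
    return [[frame[n - 1 - x][y] for x in range(n)] for y in range(n)]


def best_block_match(prev, cur, bx, by, block=4, search=2):
    """Find the horizontal/vertical offset into prev that best predicts one block of cur.

    Returns (sad, dx, dy), where sad is the sum of absolute differences
    left over after the best purely translational match.
    """
    n = len(prev)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            sad = 0
            for y in range(block):
                for x in range(block):
                    a = cur[(by + y) % n][(bx + x) % n]
                    b = prev[(by + y + dy) % n][(bx + x + dx) % n]
                    sad += abs(a - b)
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best


def total_residual(prev, cur, block=4, search=2):
    """Sum the leftover error over every block: roughly, what still has to be encoded."""
    n = len(prev)
    return sum(
        best_block_match(prev, cur, bx, by, block, search)[0]
        for by in range(0, n, block)
        for bx in range(0, n, block)
    )


if __name__ == "__main__":
    prev = make_frame(16)
    print("pan residual:   ", total_residual(prev, shift(prev, 1, 2)))
    print("rotate residual:", total_residual(prev, rotate90(prev)))
```

A pan inside the search window is predicted exactly, so nothing is left to encode. The rotated frame leaves a large residual in this toy, because no horizontal or vertical offset lines the blocks up again.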

Meanwhile, there has been a revolution in video processing. It's been going on in video games, and in movies and TV. The beautiful photorealistic scenes you now see in video games look that way because they are textured 3D models rendered on the fly for you. Even the cut scenes work this way, though their encoding is often what compression researchers dismissively call a 'Graduate Student Algorithm' - hand-tweaking the models and textures to play back well within the constraints of the device. Most movies and TV have also been through 3D modelling and rendering, from Pixar through visual effects to the mundane superimposition of yard lines on sports. The proportion of YouTube that is animation, machinima or videogame run-throughs with commentary keeps growing too.

Yet codecs remain blind to this. All this great 3D work is reduced to small 2D squares. A trimesh and texture codec seems an obvious innovation - even phones have GPUs in them now, and desktops have had them for 20 years. Web browsers have been capable of complex animations for ages too. Now that they have the ability to decode bytestreams to WebGL in real time, we may finally get codecs that are up to the standards we expect from videogames, TV and movies, with the additional advantage of resolution independence. It's time for this change.