Wednesday 17 June 2009

Theora: it's not totally useless...

I've been taking a brief look at Theora, since it seems to be winning its battle to be one of the standard HTML5 video-tag codecs.

It's not entirely stupid: it's obviously been designed for PCs rather than by anyone involved in TV or video conferencing, and it's in much the same state as WMV9 Main Profile (which is nearly, but not quite, VC-1 Main Profile).

Theora turns out to be a fairly standard I- and P-only block-structured codec. We have no B-frames, but we can predict either from the preceding frame or the preceding I-frame (Theora calls them INTRA or INTER rather than I and P). Theora is progressive-only and nominally fixed frame-rate only, though I suspect variable frame rate by PTS will become quite common.

Theora's blocks are the right size (8x8). It has two block groupings - macroblocks of 2x2 blocks and superblocks of 4x4 blocks - with the blocks in a superblock arranged along a Hilbert curve rather than in raster order as in MPEG-2, H.264 and VC-1.
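
To make the block ordering concrete, here is a minimal sketch of a generic Hilbert curve walk over a 4x4 grid. It is the textbook index-to-coordinate conversion, not a transcription of the table in the Theora spec; the name hilbert_d2xy is mine, and (0,0) is assumed to be the bottom-left block.

```python
def hilbert_d2xy(n, d):
    """Map index d (0 .. n*n-1) to (x, y) on an n x n Hilbert curve.

    n must be a power of two. Generic construction, not Theora's own table.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Coding order of the 16 blocks in one 4x4 superblock, as (x, y) pairs.
order = [hilbert_d2xy(4, d) for d in range(16)]

# Print the grid with the top row first, so (0,0) ends up bottom left.
grid = {xy: d for d, xy in enumerate(order)}
for y in range(3, -1, -1):
    print(" ".join("%2d" % grid[(x, y)] for x in range(4)))
```

Every step of the walk moves to a neighbouring block, which is the point of using the curve: consecutive blocks in coding order are always spatially adjacent.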

Raster order for Theora is bottom-to-top left-to-right, so (0,0) is bottom left rather than top left. It's unclear why this happened.

There's a fairly conventional three-plane colour structure, two supported colour spaces (NTSC-M and PAL), you can code in 4:2:0, 4:2:2 or 4:4:4, and chroma sits between luma samples in both X and Y - there's no variable chroma positioning.

The decoded region is in whole macroblocks, but the visible frame can be any window on it so we can have arbitrary amounts of invisible picture. It appears that superblocks are the unit of coding but macroblocks the unit of motion compensation.
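
A rough sketch of the arithmetic, assuming a macroblock covers 16x16 luma samples (2x2 blocks of 8x8) and that rows are counted from the bottom. The function names are mine and purely illustrative.

```python
MB_SIZE = 16  # assumption: 2x2 blocks of 8x8 luma samples per macroblock


def coded_size(visible_w, visible_h):
    """Round the visible picture size up to whole macroblocks."""
    def round_up(v):
        return -(-v // MB_SIZE) * MB_SIZE
    return round_up(visible_w), round_up(visible_h)


def to_top_left_row(y_bottom_up, coded_h):
    """Convert a bottom-up row index into the usual top-down one."""
    return coded_h - 1 - y_bottom_up


w, h = coded_size(1920, 1080)
print(w, h, to_top_left_row(0, h))
```

So a 1920x1080 picture needs a 1920x1088 decoded region, with eight rows of invisible picture somewhere in it.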

There's a fairly conventional MV/residual/deblock filter structure with only one, fairly simple in-loop deblock. You'll probably want an out-of-loop dering and deblock for low bitrate.

The transform is a DCT with a very particular implementation - the butterflies and cosine approximation values are given in the spec. It's effectively yet another explicit integer-only frequency transform, and a quick read suggests that it's exact.

Motion vector derivation and motion compensation are pretty standard; we get motion vectors down to quarter-pel and the filter is a round-and-average beast rather than anything FIR-like.

The bitstream is run-length Huffman coded and packets are bit-counted, so we have to rely on out-of-ES framing to recover from synchronisation errors. There's obviously no emulation prevention. Presumably for coding efficiency reasons Theora groups bits by role rather than by macroblock, so we get all the coding markers, then all the MVs, then all the coefficients - a bit like some of the bitplane coding in VC-1.

This means we need more memory (and more memory I/O) than necessary, but it's not entirely fatal and at least we get the coefficients last.

Theora has almost entirely dynamic quant and coding tables, stored in the decoder initialisation headers, which may be quite big - the standard suggests around 16k. For effective MPEG-TS/PS/PES use we're therefore going to need some kind of out-of-ES, SPS-like framing for the headers, which effectively means that Theora has no ES. This is a right pain, both because Theora ES streams don't exist and because we can't optimise quant tests in zigzag decode.

The zigzag table, oddly, is fixed, so we can optimise that. Go figure.

The Ogg framing format is quite odd - its plethora of structures is reminiscent of ASF - but should be dealable with. It's clearly where Matroska got its odd thread ideas from.

Next up: Dirac ..

Sunday 14 June 2009

initramfs and how not to claw your eyes out whilst living with it.

I've been trying to persuade muddle to get Linux booting - and it now can, at least up to saying 'hello world' from a C program that masquerades as init.
This turns out to be a little bit tricky. The key bit of documentation is in the kernel source - Documentation/filesystems/ramfs-rootfs-initramfs.txt. Important things to know include:

  • 2.6 kernels understand compressed (gzipped) cpio archives as a ramdisk format. This means you can build ramdisks as an ordinary user, using cpio -H newc, without needing to be root.
    This being UNIX there are, of course, many different kinds of cpio archive. The -H newc above forces the hex SVR4 interoperable cpio format, which turns out to be vaguely documented by the LSB.
    The muddle cpio deployment will build the archive for you in the right format (and incidentally there is some stealable Python in the cpiofile.py file that will build you SVR4 cpio archives - there doesn't seem to be a standard Python module to do this).

  • On boot, the kernel looks for a suitable initrd in three places:

    • In the ramdisk built with the kernel.
    • In any initrd it finds in memory, provided by the bootloader.
    • In the root fs.

    Since a ramdisk is always built into the kernel (to avoid testing for the case where there isn't one), it can't do this by mounting the first disk it comes across and assuming it's the one to use. So it mounts each in turn and tries to find /init to execute. If there isn't one, it tries the next method.
    We actually want to use a separate ramdisk in this case to avoid having to faff with asking the kernel to incorporate our ramdisk in its image, which makes the build system marginally nastier than it needs to be.

  • (aside) You will note that the bootloader is obliged to parse the kernel command line for an initrd parameter so that it can make the ramdisk available. I'm currently using syslinux, which does this for you; U-Boot and other, more primitive, bootloaders will probably require you to package the initrd with the kernel - I'll cover this in a later article when I actually have to do it :-). This is how the initrd data gets made available even though the kernel has not yet loaded the module which gives it access to the filesystem on which a separate initrd might reside.

  • Hence, if your cpio archive doesn't contain a /init the kernel can execute, the kernel will simply skip on and issue a 'can't mount root' error message which implies that it thought there was something bad about your cpio archive. This may be a lie - perhaps your cpio archive was fine and just didn't contain the right file. Attempting to set root=/dev/ram0 won't work either - cramfs can't understand compressed cpio archives.

  • To build a working initrd, all you need is the libgcc1 and libc6 packages from Ubuntu (muddle now has a deb package builder which will rip the useful data out of these for you, but dpkg-deb -X will do just as well), and a /init. Use cpio -H newc -o to pack these together into an initrd file.

  • You can test this fairly easily with qemu -kernel my_kernel -initrd my_initrd -append noapic (qemu doesn't love 2.6.30's APIC support and I expect this is true of other kernel versions too).
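
For the curious, the newc format is simple enough to write by hand. What follows is a minimal sketch of a gzipped newc archive builder in Python. It is not muddle's cpiofile.py; the function names are mine, every file is owned by root with mode 0755, and directories, symlinks and device nodes are left out.

```python
import gzip
import stat

NEWC_MAGIC = b"070701"


def _pad4(length):
    """NUL padding needed to bring length up to a multiple of 4."""
    return b"\0" * (-length % 4)


def newc_entry(name, data=b"", mode=stat.S_IFREG | 0o755, ino=0):
    """One newc record: 110-byte ASCII hex header, name, then file data."""
    name_bytes = name.encode("ascii") + b"\0"
    fields = (
        ino, mode, 0, 0,   # inode, mode, uid, gid (root:root assumed)
        1, 0,              # nlink, mtime
        len(data),         # file size
        0, 0, 0, 0,        # dev major/minor, rdev major/minor
        len(name_bytes),   # name size, including the trailing NUL
        0,                 # check field, unused by newc
    )
    header = NEWC_MAGIC + b"".join(b"%08X" % f for f in fields)
    head = header + name_bytes
    return head + _pad4(len(head)) + data + _pad4(len(data))


def build_initramfs(files):
    """files maps archive names (no leading slash) to file contents."""
    out = b""
    for ino, (name, data) in enumerate(sorted(files.items()), start=1):
        out += newc_entry(name, data, ino=ino)
    out += newc_entry("TRAILER!!!", mode=0)  # end-of-archive marker
    return gzip.compress(out)


# Placeholder contents; a real /init would be an executable the kernel can run.
archive = build_initramfs({"init": b"placeholder"})
```

To use it for real you would pass the bytes of your compiled init program under the name init and write the result to the file you hand to -initrd. Anything that lives in a subdirectory (the libraries, say) would also need directory entries, which this sketch does not generate.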


And that's all, folks. I'll package this up into a muddle example at some point - leave a comment or poke me by email if you're interested and it's not on trunk.

My god it's full of HTML editors ..

OK. First post ..

Welcome to rrw's blog. This is mostly going to be full of techie articles (read: exasperated rants) and is the blog that will shortly be linked from the Kynesim (http://www.kynesim.co.uk/) website. There's an associated twitter account - rrw1000 - which I should work out how to federate with updates here.

Anyway, I own and run a small technology development consultancy in Cambridge, UK. We specialise in embedded systems, home automation and IPTV and my expertise is in operating systems and firmware.

I'm the principal author and maintainer of muddle (http://code.google.com/p/muddle) - a tool for building embedded systems firmware.