CodeSourcery fails again

The bug I discovered in CodeSourcery’s 2008q3 release of their GCC version was apparently deemed serious enough for the company to publish an updated release, tagged 2008q3-72, earlier this week. I took it for a test drive.

Since last time, I have updated the FFmpeg regression test scripts so that a cross-build can easily be tested on the target device. For the compiler test, this means that much more code is checked for correct operation than in the rather limited tests I performed on previous versions. Having verified that all tests pass when built with the 2007q3 release, I proceeded with the new 2008q3-72 compiler.

All but one of the FFmpeg regression tests passed. Converting a colour image to 1-bit monochrome format failed. A few minutes of detective work revealed the erroneous code, and a simple test case was easily extracted.

The test case looks strikingly familiar:

extern unsigned char dst[512] __attribute__((aligned(8)));
extern unsigned char src[512] __attribute__((aligned(8)));

void array_shift(void)
{
    int i;
    for (i = 0; i < 512; i++)
        dst[i] = src[i] >> 7;
}
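
The excerpt does not show how the failure manifests, so the following is only a sketch of how one might check the result of this loop on the target. The helper name array_shift_mismatches, the fill pattern, and the use of volatile to discourage the compiler from transforming the reference check in the same way as the loop under test are all my own assumptions for illustration.

```c
#include <stddef.h>

/* Definitions for the arrays that the test case declares as extern. */
unsigned char dst[512] __attribute__((aligned(8)));
unsigned char src[512] __attribute__((aligned(8)));

/* The function under test, repeated here so the file builds on its own. */
void array_shift(void)
{
    int i;
    for (i = 0; i < 512; i++)
        dst[i] = src[i] >> 7;
}

/* Hypothetical helper: returns the number of bytes array_shift() got wrong. */
int array_shift_mismatches(void)
{
    volatile unsigned char expected;
    int i, bad = 0;

    /* Arbitrary pattern that exercises both states of the top bit. */
    for (i = 0; i < 512; i++)
        src[i] = (unsigned char)(i * 37 + 11);

    array_shift();

    for (i = 0; i < 512; i++) {
        /* Same result as ">> 7" on an 8-bit value, written differently. */
        expected = (src[i] & 0x80) ? 1 : 0;
        if (dst[i] != expected)
            bad++;
    }
    return bad;
}
```

A correctly compiled build should report zero mismatches. Any other value points at the loop above, although it says nothing about which optimisation is at fault.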


Ogg timestamps explored

Ogg is the name of a multimedia container format invented by the Xiph.Org Foundation. Moreover, it is a deeply flawed format. One of its many flaws relates to timestamps, an aspect of Ogg I shall explore in this article.

Ogg structure

The Ogg format splits elementary stream data into a sequence of packets which are then distributed arbitrarily across pages. A page can contain any number of packets, and a packet can span any number of pages. This two-level packetisation scheme is used since the packet headers would otherwise, due to design shortfalls elsewhere, become prohibitively large.
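
To make the two levels concrete, here is a minimal sketch that counts how many packets finish on a given page. It relies on the page layout from the Ogg specification (RFC 3533): a 27-byte fixed header starting with "OggS", whose last byte gives the number of entries in the segment table that follows, and lacing values where anything below 255 terminates a packet. The function name ogg_packets_finished is hypothetical, and the sketch does not verify the page checksum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OGG_HEADER_SIZE 27  /* fixed part of the page header */

/*
 * Returns the number of packets that end on this page, or -1 if the
 * buffer does not hold a complete page header and segment table.
 */
int ogg_packets_finished(const uint8_t *page, size_t len)
{
    size_t nsegs, i;
    int finished = 0;

    if (len < OGG_HEADER_SIZE || memcmp(page, "OggS", 4) != 0)
        return -1;

    nsegs = page[26];               /* number of lacing values */
    if (len < OGG_HEADER_SIZE + nsegs)
        return -1;

    /* A lacing value below 255 marks the last segment of a packet. */
    for (i = 0; i < nsegs; i++)
        if (page[OGG_HEADER_SIZE + i] < 255)
            finished++;

    return finished;
}
```

A page whose segment table ends in 255 carries the start of a packet that continues on the next page. That packet is not counted here, which matters for the timestamp rule discussed below.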

Timestamps in Ogg

Each Ogg page (not packet) header contains a timestamp, or granule position in Ogg terms, encoded as a 64-bit number. The precise interpretation of this number is not defined by the Ogg specification; it depends on the codec used for each elementary stream. The specification does, however, tell us one thing:

The position specified is the total samples encoded after including all packets finished on this page (packets begun on this page but continuing on to the next page do not count).

The meaning of samples is, again, left unspecified. It is merely suggested that it could refer to video frames or audio PCM samples.
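
Reading the raw number is the easy part. The sketch below assumes the header layout from RFC 3533, where the granule position occupies bytes 6 to 13 of the page header in little-endian order. The function name ogg_granule_position is hypothetical, and the caller is assumed to pass at least 14 valid header bytes.

```c
#include <stdint.h>

/*
 * Extracts the 64-bit granule position from an Ogg page header.
 * The result is returned as a signed value because the specification
 * reserves -1 to mean that no packet finishes on this page.
 */
int64_t ogg_granule_position(const uint8_t *header)
{
    uint64_t v = 0;
    int i;

    /* Little-endian: byte 6 is the least significant. */
    for (i = 7; i >= 0; i--)
        v = (v << 8) | header[6 + i];

    /* Conversion of values above INT64_MAX is implementation-defined
     * in C, but yields the two's complement result on common compilers. */
    return (int64_t)v;
}
```

What the returned number means still depends entirely on the codec, so this function alone tells a player nothing about time.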

Timestamping the end of packets, instead of the start, is impractical for a number of reasons including, but not limited to, the following:

  • Scheduling decoded samples for playback is more easily done based on the desired start time than on the end time.
  • Virtually every other container format ties timestamps to the start of the first following sample. Doing it differently only complicates players and other tools supporting multiple formats without providing any advantage.
  • Inferring the timestamp of the first sample of the stream is impossible without first decoding, at least partially, every packet in the first page.
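
The last point can be sketched in code. Assume, purely for illustration, a codec whose granule position counts samples and whose per-packet sample counts are known only after at least partially decoding each packet. Recovering the position of the first sample on a page then requires the duration of every packet finished on that page. The function name and parameters below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Given the granule position of a page (the position after the last
 * packet finished on it) and the sample count of each of those packets,
 * returns the position of the first sample of the first packet.
 */
int64_t first_sample_position(int64_t page_granule,
                              const uint32_t *packet_samples,
                              size_t npackets)
{
    int64_t total = 0;
    size_t i;

    /* Every packet must have been inspected to know its duration. */
    for (i = 0; i < npackets; i++)
        total += packet_samples[i];

    return page_granule - total;
}
```

With a start-based timestamp, none of this arithmetic would be needed: the value in the header would already be the answer.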

Timestamp interpretation

As mentioned previously, the meaning of the 64-bit timestamps associated with an elementary stream depends on the codec of the stream. I conducted a survey of codecs with defined Ogg mappings, looking specifically at their timestamp definitions.