I want to explain the RIFF / WAV format with some screen shots!
The following WAV file was imported with Wavelab 5, ID3-tagged with Tag & Rename and “WAV”-tagged with Sony Soundforge 9.0e.
A RIFF file is one outer chunk whose form type (e.g. “WAVE”) is followed by multiple subchunks (e.g. "fmt ", “data”, “LIST”, “bext”, "id3 ", etc.). Each subchunk starts with its id (4 bytes), followed by the size of the subchunk data (4 bytes) and then the subchunk data itself. After that comes the next subchunk id and so on.
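The id + size + data layout described above can be walked with a few lines of Python. This is only a minimal sketch: the names `iter_subchunks` and `make_chunk` are made up for illustration, and the demo file is synthetic, not the file from the screen shots. It also assumes the RIFF rule that subchunk data with an odd size is followed by one pad byte that is not counted in the size.

```python
import struct

def iter_subchunks(buf):
    """Yield (id, data_offset, size) for each subchunk of a RIFF buffer."""
    riff_id, riff_size, form_type = struct.unpack_from("<4sI4s", buf, 0)
    if riff_id != b"RIFF":
        raise ValueError("not a RIFF file")
    end = min(8 + riff_size, len(buf))
    pos = 12  # first subchunk follows "RIFF", the size and the form type
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        yield chunk_id, pos + 8, size
        pos += 8 + size + (size & 1)  # skip the pad byte after odd-sized data

def make_chunk(chunk_id, data):
    """Build id + little-endian size + data (+ pad byte if the size is odd)."""
    pad = b"\x00" if len(data) % 2 else b""
    return chunk_id + struct.pack("<I", len(data)) + data + pad

# Tiny synthetic file: a 16-byte "fmt " subchunk and a 3-byte "data" subchunk.
body = b"WAVE" + make_chunk(b"fmt ", bytes(16)) + make_chunk(b"data", b"\x01\x02\x03")
demo = make_chunk(b"RIFF", body)
chunks = list(iter_subchunks(demo))
for chunk_id, offset, size in chunks:
    print(chunk_id, hex(offset), hex(size))
```

The walker never needs to know what a subchunk means. The size field alone is enough to jump to the next id.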
When you open this WAV file with a hexdump tool, you find the following:
- Screen Shot:
First you can see the id “RIFF” (red), followed by the size (green) of all the data that follows ($000D7B8C, stored backwards, i.e. in little-endian byte order). Adding the next address $00000008 to the size $000D7B8C gives $000D7B94, which is the end of the file (see last screen shot).
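As a small worked example of the "backwards" size: assuming the four size bytes appear in the hexdump as 8C 7B 0D 00 (which is what a little-endian $000D7B8C looks like), they can be decoded and checked against the end of the file like this. The variable names are only illustrative.

```python
import struct

# Bytes as they would be stored at $00000004..$00000007 (assumed from the text).
size_bytes = bytes([0x8C, 0x7B, 0x0D, 0x00])

# "<I" = little-endian unsigned 32-bit integer.
riff_size = struct.unpack("<I", size_bytes)[0]

# The data counted by the size starts right after the size field, at $00000008.
file_end = 0x00000008 + riff_size

print(f"${riff_size:08X} ${file_end:08X}")  # prints "$000D7B8C $000D7B94"
```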
Next comes the “WAVE” id (blue). Strictly speaking it is the form type of the RIFF chunk rather than a chunk with its own size field, and it is followed by a lot of subchunks:
Normally the first subchunk is "fmt " (orange), followed by the size of this subchunk (violet). This subchunk contains data like audio format, bits per sample, sample rate, number of channels, etc. When you add the next address $00000014 + the size of the subchunk $00000010, you land …
at $00000024, where the subchunk “data” starts (yellow), followed by its size and the sound samples themselves. Adding the next address (after the size) $0000002C + the length of the subchunk $000D770C again gives $000D7738.
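To make the contents of the 16-byte "fmt " subchunk more concrete, here is a hedged sketch that unpacks the usual PCM layout (audio format, channels, sample rate, byte rate, block align, bits per sample). The values in `example` are hypothetical (stereo, 44100 Hz, 16 bit). The real values of the file in the screen shots are not given in this post, and `FmtChunk` and `parse_fmt` are illustrative names.

```python
import struct
from collections import namedtuple

FmtChunk = namedtuple(
    "FmtChunk",
    "audio_format channels sample_rate byte_rate block_align bits_per_sample",
)

def parse_fmt(data):
    """Unpack the first 16 bytes of a "fmt " subchunk (all little-endian)."""
    return FmtChunk(*struct.unpack_from("<HHIIHH", data, 0))

# Hypothetical example data: PCM (1), 2 channels, 44100 Hz, 16 bits per sample.
example = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
fmt = parse_fmt(example)
print(len(example), fmt)
```

The packed example is exactly $10 bytes long, which matches the subchunk size shown in violet.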
Let’s look at $000D7738:
Here you find the subchunk “LIST” with the size $00000092. This subchunk specifies the blocks “INFO” and “exif” (see here). After “INFO” you can find the first tag “INAM”, followed by the length $00000012 of the text that follows, “You’re The First” (incl. 2 zero bytes), and then the next tag and so on.
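The tags inside “INFO” follow the same id + length + data pattern, so they can be read the same way. The sketch below assumes a LIST subchunk whose data starts with “INFO”, pads odd-sized entries to an even length, and decodes the text as Latin-1. `parse_info` is an illustrative name, and the demo bytes are rebuilt from the description above, not copied from the file.

```python
import struct

def parse_info(list_data):
    """Return {tag: text} from the data of a LIST subchunk of type INFO."""
    if list_data[:4] != b"INFO":
        raise ValueError("not an INFO list")
    tags = {}
    pos = 4  # entries start right after the "INFO" id
    while pos + 8 <= len(list_data):
        tag, size = struct.unpack_from("<4sI", list_data, pos)
        raw = list_data[pos + 8 : pos + 8 + size]
        # Strip the trailing zero bytes that are counted in the length.
        tags[tag.decode("ascii")] = raw.rstrip(b"\x00").decode("latin-1")
        pos += 8 + size + (size & 1)
    return tags

# 16 characters of text plus 2 zero bytes = $12 bytes, as in the post.
title = b"You're The First\x00\x00"
demo = b"INFO" + b"INAM" + struct.pack("<I", len(title)) + title
info = parse_info(demo)
print(hex(len(title)), info)
```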
Following the logic, $000D7740 + length $00000092 = $000D77D2. Here starts the subchunk “bext”, which is specified elsewhere. Following the logic again, $000D77DA + length $0000025A = $000D7A34 …
which brings us to the next subchunk, "id3 " (see screen shot):
It starts with the subchunk id "id3 " + size, and after that the ID3 data begins, following the ID3 specification (id3.org), as you typically find in MP3s.
$000D7A3C + $00000158 = $000D7B94, which is the end of the file (the same address as calculated at the top).
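The whole chain of additions can be re-checked in a few lines. The start addresses and sizes below are the ones quoted in this post. The only assumption is that every subchunk header is 8 bytes (id + size) and that no pad bytes are needed, which holds here because all sizes are even.

```python
# (id, address of the id, size of the data), all taken from the text above
chunks = [
    ("fmt ", 0x0000000C, 0x00000010),
    ("data", 0x00000024, 0x000D770C),
    ("LIST", 0x000D7738, 0x00000092),
    ("bext", 0x000D77D2, 0x0000025A),
    ("id3 ", 0x000D7A34, 0x00000158),
]

pos = 0x0000000C  # first subchunk id, right after "RIFF", size and "WAVE"
for name, start, size in chunks:
    assert pos == start, name   # each id sits where the previous data ended
    pos = start + 8 + size      # skip id (4) + size (4) + data

riff_end = 0x00000008 + 0x000D7B8C  # end of file according to the RIFF size
print(f"${pos:08X} ${riff_end:08X}")  # prints "$000D7B94 $000D7B94"
```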
So as you can see, the logic of subchunk + length of a subchunk makes it possible to add various subchunks.
If a program does not understand a specific subchunk, it must keep it in its whole length, and when the (changed) file is saved again, it must write the subchunk back exactly as it was read.
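One way to follow this rule is to keep every subchunk as an opaque (id, data) pair, change only the ones the program understands, and write everything back in the original order. This is a sketch under assumptions, not how any particular editor does it: `split_chunks` and `join_chunks` are illustrative names, and "xyz " is a made-up id standing in for a subchunk the program does not know.

```python
import struct

def split_chunks(buf):
    """Return the form type and a list of (id, data) pairs, known or not."""
    riff_id, riff_size, form_type = struct.unpack_from("<4sI4s", buf, 0)
    if riff_id != b"RIFF":
        raise ValueError("not a RIFF file")
    end = min(8 + riff_size, len(buf))
    chunks, pos = [], 12
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        chunks.append((chunk_id, bytes(buf[pos + 8 : pos + 8 + size])))
        pos += 8 + size + (size & 1)  # skip the pad byte after odd-sized data
    return form_type, chunks

def join_chunks(form_type, chunks):
    """Write the subchunks back and recompute the outer RIFF size."""
    body = form_type
    for chunk_id, data in chunks:
        body += chunk_id + struct.pack("<I", len(data)) + data
        if len(data) % 2:
            body += b"\x00"
    return b"RIFF" + struct.pack("<I", len(body)) + body

original = join_chunks(
    b"WAVE",
    [(b"fmt ", bytes(16)), (b"data", b"\x01\x02"), (b"xyz ", b"opaque")],
)
form_type, chunks = split_chunks(original)
# Change only the subchunk we understand; everything else passes through.
edited = [(cid, b"\x03\x04\x05\x06" if cid == b"data" else data)
          for cid, data in chunks]
saved = join_chunks(form_type, edited)
```

Only the “data” subchunk and the outer RIFF size change in `saved`. The unknown "xyz " subchunk comes out byte for byte as it went in.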
That’s the specification, but a lot of programs do not handle the chunk & subchunk logic correctly.
I hope this helps.