This bug has been annoying me for as long as I can remember. I've been using Mp3tag for over a year now and thought I should report it, although looking through some of the support forum posts, I'm sure it's been touched on already. Sorry in advance if this post seems a little long-winded, but I aim to cover everything that I have found.
Scenario:
I often need to tag a large number of files and find Mp3tag a fantastic tool for allowing me to get the job done quickly. With freedb integration, tagging a ripped CD is almost as easy as just clicking a button. At least, I'm sure it is for most people.
Unfortunately, most of the files I tag use Japanese text encoded using Shift-JIS. 95% of the time, everything works correctly. But sometimes things go horribly wrong.
Example:
I'll introduce you to the bug through a worked example.
Below, we have a fictional CD single that I created for the purpose of this post containing two tracks - "Mysterious" and "Mahou no Soda" - performed live by a non-existent artist, Hanako Ishida. As you can see, everything appears almost as you would expect it to. The only clue that something isn't quite right is that the title for track 2 in the list only says "Mahou no" - the end is missing.
Now let's try track 2.
Okay, let's have one last go at trying to title our track using the tag panel to edit the title instead of editing it in-line in the list.
The list on the right, however, still only shows the start of the track title.
Explanation: (or an attempt at one)
As I said at the start of this post, I've been living with this annoyance for over a year now - plenty of time to further investigate what might be causing it.
The problem only occurs when certain characters are encountered in a string, which is why the first track was easy to rename while the second wasn't. The second track title contains a "problem" character - that innocent-looking bar towards the end that makes up part of the word "soda".
For those unfamiliar with multi-byte character encoding schemes such as Shift-JIS, a character may be represented by more than one byte, not just one as in ASCII or most traditional Western encodings. Shift-JIS uses two bytes for most Japanese characters (plain ASCII and half-width katakana still take one byte each). In Shift-JIS, this dash-like character (a chouon, to be exact) is represented by the two bytes 0x81 and 0x5B. Testing has led me to discover that any two-byte character containing any of the bytes 0x5B, 0x5C or 0x5D causes serious problems in Mp3tag.
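To make the byte values concrete, here is a minimal Python sketch that lists the characters in a string whose Shift-JIS encoding contains one of the three bytes. It assumes "soda" is spelled ソーダ (the usual katakana spelling) and uses Python's built-in shift_jis codec; the function name suspect_characters is just something made up for this illustration, and none of this says anything about how Mp3tag itself handles text.

```python
# Byte values reported above as troublesome inside a multi-byte character.
SUSPECT_BYTES = {0x5B, 0x5C, 0x5D}

def suspect_characters(text, encoding="shift_jis"):
    """Return (character, hex of encoded bytes) for each multi-byte
    character in text whose encoding contains a suspect byte."""
    found = []
    for ch in text:
        raw = ch.encode(encoding)
        # Single-byte characters (plain ASCII brackets etc.) are skipped.
        if len(raw) > 1 and SUSPECT_BYTES.intersection(raw):
            found.append((ch, raw.hex()))
    return found

# The chouon on its own: two bytes, the second being 0x5B.
print("ー".encode("shift_jis").hex())   # 815b

# Assumed spelling of "soda"; ダ (0x83 0x5F) is not flagged.
print(suspect_characters("ソーダ"))      # [('ソ', '835c'), ('ー', '815b')]
```

If that spelling is right, the word contains two affected characters rather than one: the chouon (second byte 0x5B) and ソ (second byte 0x5C).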
0x5B and 0x5D represent the bracket characters "[" and "]" in ASCII, while 0x5C is a backslash. Entering brackets normally doesn't cause any strange behaviour. The problems only appear when these bytes are part of a sequence that Windows identifies as a multi-byte character.
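The overlap is easy to see from outside Mp3tag. The sketch below uses a made-up title, "ソーダ [live]" (again assuming that katakana spelling), and compares a search for "[" over the raw Shift-JIS bytes with the same search over the decoded text.

```python
title = "ソーダ [live]"            # hypothetical title, for illustration only
raw = title.encode("shift_jis")

# Space-separated hex dump of the encoded title (Python 3.8+).
print(raw.hex(" "))

# A byte-level search stops inside the chouon (0x81 0x5B) ...
byte_hit = raw.find(b"[")
# ... while a character-level search finds the real bracket.
char_hit = title.find("[")
print(byte_hit, char_hit)          # 3 4

# A literal bracket is still the single byte 0x5B in Shift-JIS.
print("[".encode("shift_jis"))     # b'['
```

The byte-level hit at offset 3 is the second byte of the chouon, not a bracket; the real "[" sits at byte offset 7.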
I'm unsure exactly why this is, as I obviously don't know what's going on under that nice, user-friendly interface. It is possible that an internal mechanism for escaping certain bytes prior to some kind of parsing process is to blame: the escaping would appear to work on a per-character level, while the parser works on a per-byte level. If this is the case, it would be fine if the underlying parser also understood the concept of characters, but it seems to be byte-oriented. I am also unsure why using the text field in the tag panel on the left to alter the ID3 tag seems to bypass this parsing process.
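To show what I mean by a parser that "understands characters", here is a rough Python sketch of two ways to look for the special bytes in Shift-JIS data. This is purely my own illustration, not Mp3tag's code: the function names are invented, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are the ones used by the Windows flavour of Shift-JIS as far as I know.

```python
SPECIAL = {0x5B, 0x5C, 0x5D}   # "[", "\" and "]" in ASCII

def is_lead_byte(b):
    """True if b starts a two-byte Shift-JIS character (assumed ranges)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def special_offsets_bytewise(raw):
    """Byte-oriented scan: every 0x5B/0x5C/0x5D counts, wherever it is."""
    return [i for i, b in enumerate(raw) if b in SPECIAL]

def special_offsets_charwise(raw):
    """Character-aware scan: skip both bytes of a two-byte character,
    so only genuine single-byte specials are reported."""
    offsets = []
    i = 0
    while i < len(raw):
        if is_lead_byte(raw[i]) and i + 1 < len(raw):
            i += 2                 # lead byte plus trail byte
            continue
        if raw[i] in SPECIAL:
            offsets.append(i)
        i += 1
    return offsets

raw = "ソーダ[1]".encode("shift_jis")   # assumed spelling of "soda"
print(special_offsets_bytewise(raw))    # [1, 3, 6, 8]
print(special_offsets_charwise(raw))    # [6, 8]
```

The byte-oriented scan reports two extra hits inside ソ and ー, which is the kind of mismatch I suspect is happening; the character-aware scan only reports the real brackets.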
As for the backslash, entering it in a title, whether in-line in the list or via the tag panel, produces some undesirable results, although the exact result differs depending on which mechanism you use (but why should it?). Surely there should be a system in place for handling/escaping user-entered backslashes?
I'd love to continue to use Mp3tag for tagging, but as the number of files I need to tag increases, this issue is becoming more and more annoying. If there is anything I can do to help or if you would like more information, please let me know. Also, if anyone has any solutions/workarounds, they would be most appreciated.
- SoZ
(Mp3tag 2.32a on Windows XP)