2008/3/17, Ben Hutchings <[EMAIL PROTECTED]>:
> The conversion process does *not* remove ID3v1 tags, so you may be able
> to recover by deleting the ID3v2 tags (id3v2 -d).

No, somehow this does not recover the information. See the test below.

> What encoding was used in the ID3v1 tags?  ID3v1 does not have any flag
> to indicate encoding and is normally assumed to use ISO 8859-1.  Text
> with this encoding seems to be converted correctly.

I am not 100% sure, but I think I have a mixture of UTF-8 and ISO 8859-13 files. AmaroK, easytag and others detect and display them correctly without any assistance. If they use the LANG environment variable to guess the encoding, then it must be UTF-8, perhaps with some kind of smart fallback to ISO 8859-13 when reading (if ID3v1 *really* lacks the encoding info).

I did the following test:

1) recorded a blank mp3
2) added/edited the tag with amarok (amarok and easytag display it correctly; id3v2 shows that only an ID3v1 tag is present, and the UTF-8 characters are broken because they are interpreted as ISO 8859-1)
3) did the conversion with "id3v2 -C" (id3v2 shows that both ID3v1 and ID3v2 tags are present; all tools show broken UTF-8 characters)
4) stripped with "id3v2 -d" (id3v2 shows that only an ID3v1 tag is present; all tools still show broken UTF-8)

So my conclusion is that the other tools somehow know the correct encoding and interpret the tag correctly, but id3v2 overwrites this information, effectively killing the method used by the other tools.

Interestingly, easytag offers to save some(?) tag information on the files broken by id3v2, although I did not change anything. My blind guess is that it found that the encoding information is missing and wants to write something generic there, although the results do not improve (for obvious reasons).

I've put the files from the test here (perhaps you can dig into them with hex dumps):
http://www.cs.aau.dk/~marius/id3v2

-- 
Marius Mikučionis
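
P.S. To illustrate what I suspect is going on, here is a minimal Python sketch. It is only a guess at the mechanism: I have not checked whether id3v2 or the other tools actually work this way. The function name guess_decode and the sample string are made up for the illustration. It shows (a) how UTF-8 bytes look when read as ISO 8859-1, (b) how a "try UTF-8, fall back to ISO 8859-13" reader could get both kinds of my files right, and (c) how text that was read as ISO 8859-1 and written back as UTF-8 stays broken for every tool.

```python
def guess_decode(raw: bytes, fallback: str = "iso8859_13") -> str:
    """Hypothetical 'smart' reader: try UTF-8 first, else use a fallback charset."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


name = "Mikučionis"

# (a) UTF-8 bytes shown by a reader that assumes ISO 8859-1
utf8_bytes = name.encode("utf-8")
as_latin1 = utf8_bytes.decode("iso8859_1")

# (b) the fallback reader handles both encodings of the same text
baltic_bytes = name.encode("iso8859_13")
from_utf8 = guess_decode(utf8_bytes)
from_baltic = guess_decode(baltic_bytes)

# (c) assumed failure mode: misread text is written back as UTF-8,
# so even the fallback reader now sees the broken characters
double_encoded = as_latin1.encode("utf-8")
after_conversion = guess_decode(double_encoded)

# If this is what happened, the damage is reversible
repaired = after_conversion.encode("iso8859_1").decode("utf-8")

print(repr(as_latin1))
print(from_utf8, from_baltic)
print(repr(after_conversion))
print(repaired)
```

If the damaged files really contain double-encoded text like this, the last step suggests that the original tags could be recovered, but that needs to be confirmed against the hex dumps first.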