Filtering European ASCII characters

AMJF · June 14, 2014, 10:59am

I need to use the Filter to show only those artists that have European accents and umlauts (and so forth) in their names, but I don't really want to do it one character at a time (I'm talking about the ISO-8859-1 standard).

I'm enquiring about this as one MP3 album I bought had a European artist name which was badly mangled, probably because it was converted from Unicode incorrectly or something, and I just want to make sure there are no more in my MP3s. I have a lot of classical music with foreign names in it, and I just want to show only the MP3s with the special characters so I can Google the names.

Is there a way to search for such characters easily?

AMJF · June 14, 2014, 5:49pm

Do you even know what I'm talking about?

ohrenkino · June 14, 2014, 6:23pm

If you yourself have doubts about the clarity of your statements, perhaps you could rephrase them?
Also, it could be that there is no

E.g. the filter works if you type in a single "foreign" (not really politically correct) character.
Just enter a
ü
and you get all the files with an "ü" in the data.
You could experiment with a list of characters in a $replace statement and then compare the length of the original string with the one where you replaced the "foreign" characters with nothing.
But you ruled that out already.

AMJF · June 14, 2014, 7:11pm

About two hours after starting this post, I used the form:

artist/title HAS "?" (with ? being the foreign character)

And tried as many combinations of A E I O U with the appropriate symbols as I could think of. I didn't know I didn't need the HAS command for that, or even quotes.

I got mostly the results I wanted, but I might extend it now to search for others - I really need a string of the most common foreign characters, really.

And I call them foreign as an umbrella term as I don't know if there's a single name that refers to them all - the individual names are accent, umlaut, etc. Please feel free to tell me if you know what that singular term is.

AMJF · June 14, 2014, 7:21pm

Oh, they're called diacritics. That's that query satisfied, then.

ohrenkino · June 14, 2014, 8:07pm

That's right. Thank you.
Perhaps this is an idea:
If you know which characters are not diacritics and you remove them, then everything that is left should be one of those characters that need a closer look.

"$if($eql($len($replace($lower(%title%),a,,b,,c,,d,,e,,f,,g,,h,,i,,j,,k,,l,,m,,n,,o,,p,,q,,r,,s,,t,,u,,v,,w,,x,,y,,z,0
,,1,,2,,3,,4,,5,,6,,7,,8,,9,)),$len(%title%)),1,0)" IS "1"

It does not consider any punctuation yet...
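
For anyone who wants to try the same "remove what is known, look at what is left" idea outside Mp3tag, here is a minimal Python sketch. It is not Mp3tag syntax and not a translation of the filter above. The names PLAIN, leftover_chars and needs_closer_look are made up for illustration, and treating ASCII punctuation and whitespace as "plain" is an assumption that goes beyond the letters and digits listed in the filter.

```python
import string

# Assumption: "plain" means ASCII letters, digits, punctuation and whitespace.
# Anything else is treated as a character that deserves a closer look.
PLAIN = set(
    string.ascii_letters + string.digits + string.punctuation + string.whitespace
)


def leftover_chars(text):
    """Return the characters of text that are not in PLAIN, in original order."""
    return "".join(ch for ch in text if ch not in PLAIN)


def needs_closer_look(text):
    """True if at least one character survives after removing the plain ones."""
    return bool(leftover_chars(text))


if __name__ == "__main__":
    for title in ["Symphony No. 9", "Dvořák: Symphony No. 9"]:
        print(title, "->", repr(leftover_chars(title)))
```

Checking whether anything is left over, rather than comparing string lengths, keeps the test independent of how many plain characters a title happens to contain.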

AMJF · June 14, 2014, 9:06pm

When I select the above only the first line is copied, and when I try to paste the second line onto the end of the first in the Mp3tag filter, it doesn't seem to work. Also, I'm sure that there's meant to be two commas between the z and the 0 - am I right? In any case, I tried that, and the best I could get was the titles with all numbers in them.

DetlevD · June 15, 2014, 4:00am

Regarding ...
http://en.wikipedia.org/wiki/ISO/IEC_8859-1#Codepage_layout
... you may use the Mp3tag filter ...
ARTIST MATCHES "[\xc0-\xff]"
... or ...
ARTIST MATCHES "[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]"

DD.20140615.0810.CEST
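
The same character range can be tried outside Mp3tag. The following is a minimal Python sketch using the standard re module, not a description of how Mp3tag evaluates MATCHES internally. The names LATIN1_HIGH and has_latin1_accent are hypothetical. Note that the range \xc0-\xff also contains the signs × and ÷, and that it does not cover letters outside ISO-8859-1 such as ř.

```python
import re

# Same range as in the filter: U+00C0 (À) through U+00FF (ÿ).
# The range also includes the multiplication sign (U+00D7) and
# the division sign (U+00F7), which are not letters.
LATIN1_HIGH = re.compile(r"[\xc0-\xff]")


def has_latin1_accent(text):
    """True if text contains at least one character in U+00C0..U+00FF."""
    return LATIN1_HIGH.search(text) is not None


if __name__ == "__main__":
    for artist in ["Bach", "Björk", "Dvořák"]:
        print(artist, has_latin1_accent(artist))
```

"Dvořák" is reported because of the á. The ř on its own lies outside this range and would not be found.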

ohrenkino · June 15, 2014, 6:17am

I can confirm that, although I did not intend it that way. It looks as though the $replace() function does not evaluate more than 32 replacement pairs, so everything after "3" is not considered. I have asked in the German forum whether this is a bug or a feature.
Anyway: DetlevD's solution is more elegant.

AMJF · June 15, 2014, 6:24am

Yes, that's it! Thank you!

On a hunch, concerning the mangled artist tag in the track I mentioned, I went back to the website I got the MP3 album from, to remind myself of the error, but they seem to have fixed it - it is now using the correct diacritic, and the title has been fixed as well.

This album, in fact - look at track 9, both title and artist WERE incorrect, out of all of them, but that's down to the person doing the typing, I suppose:
Google Play

I guess someone complained, though I don't think it was me. In any case, it's discouraging to see such details mangled when you click "Buy". I don't really like to edit the tags of any albums I buy unless there are catastrophic errors or inconsistencies.