Troubleshooting an Existing Regex

Hello Everyone,

Referencing this topic dated Oct '17.

I apply the following expression in Action Type > Format Value:

$regexp(%ARTIST%,'^.*\s([\w-]+)$','$1')

in order to retain only the last word of a given string. In this case that is the Artist field, where I wish to keep only the artist's surname.

Would you please show me where I have it wrong? It fails to apply to hyphenated last words, such as this one, which remains unaffected:

Camille Saint‐Saëns

Thank you.

That's easy to explain:
The character between Saint and Saëns in your artist name is not the same as the character after the \w in your regex.
If you try it with
$regexp(%ARTIST%,'^.*\s([\w-‐]+)$','$1') <--- the two dashes look identical, but they are not!
it should work.


Update some hours later:
In Mp3tag, the above format string does NOT work; please see the correction further down in this topic.


The differences in detail:
The character after \w in your original regex is a
U+002D : HYPHEN-MINUS {hyphen, dash; minus sign}
The character between Saint and Saëns in the artist name is a
U+2010 : HYPHEN
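
To check which character a tag field or an expression really contains, it can help to print the code point and Unicode name of each look-alike. The following is a minimal Python sketch and not something Mp3tag itself runs; the helper name describe is made up for illustration, and the two characters are written as escapes so they cannot be confused on screen.

```python
import unicodedata

# The two look-alike characters from this thread, written as escapes.
pattern_dash = "\u002d"  # the character typed after \w in the original regex
artist_dash = "\u2010"   # the character between "Saint" and "Saëns"


def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in (pattern_dash, artist_dash):
    print(describe(ch))

# The characters compare as different even though they render almost alike.
print(pattern_dash == artist_dash)
```

Pasting a suspicious character from a tag into describe() in the same way should show which dash variant it is.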

You may also encounter one of these other variants of "dashes" (looking very similar, but not equal):

image

Thank you, @LyricsLover, I certainly was not even aware of these variations. I applied your expression as shown in the screenshot above, and received this error message in the Artist field:

REGEXP ERROR: Regular expression
Invalid range end in character class
The error occurred while parsing the regular expression: '^.*\s([\w->>>HERE>>>]+)$'.

I then re-applied it like this — using only the slightly elevated hyphen in your expression — and it worked just fine:

image

Would it be feasible to construct a single expression that covers the most commonly used hyphens, since they are so hard to tell apart?

You are right, sorry for the confusion.
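In an Mp3tag regular expression you have to escape the hyphen-minus (the one after \w) with a backslash. Otherwise, inside the square brackets, it is read as a range operator between \w and the next character rather than as a literal dash, which causes the "Invalid range end" error.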
Please try it with this format string:
$regexp(%ARTIST%,'^.*\s([\w\-‐]+)$','$1')
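
The effect of the backslash can be reproduced outside Mp3tag. Below is a small Python sketch. Python's re module is a different regex engine from the one in Mp3tag, so this only illustrates the general rule that an unescaped - between two items in a character class is read as a range operator. The helper name compiles is hypothetical, and \u2010 is used in the patterns to write the U+2010 hyphen unambiguously.

```python
import re

# Artist name with U+2010 HYPHEN between the two parts of the surname.
artist = "Camille Saint\u2010Sa\u00ebns"

# Unescaped: "-" sits between \w and U+2010, so it is parsed as a range.
unescaped = r"^.*\s([\w-\u2010]+)$"
# Escaped: "\-" is a literal HYPHEN-MINUS, followed by a literal U+2010.
escaped = r"^.*\s([\w\-\u2010]+)$"


def compiles(pattern):
    """Return True if Python's re accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(unescaped))             # the range \w-U+2010 is rejected
print(compiles(escaped))               # the escaped form is accepted
print(re.sub(escaped, r"\1", artist))  # keeps only the hyphenated surname
```

The replacement r"\1" plays the same role here as '$1' in the Mp3tag format string.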

Unfortunately, I'm not aware of an easy way to do this. You would have to list each dash variant explicitly inside the square brackets of your regular expression.
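
As a rough illustration of listing the variants, here is a Python sketch that builds the character class from a string of dash-like characters. The selection in DASHES is an assumption (a few common look-alikes, not a complete list), the names DASHES, LAST_WORD and last_word are made up for this example, and the code has not been tried as an Mp3tag format string.

```python
import re

# Assumed selection of common dash-like characters; extend as needed.
DASHES = (
    "\u002d"  # HYPHEN-MINUS
    "\u2010"  # HYPHEN
    "\u2011"  # NON-BREAKING HYPHEN
    "\u2012"  # FIGURE DASH
    "\u2013"  # EN DASH
    "\u2014"  # EM DASH
    "\u2212"  # MINUS SIGN
)

# re.escape backslash-escapes "-" so it is never read as a range operator.
LAST_WORD = re.compile(r"^.*\s([\w" + re.escape(DASHES) + r"]+)$")


def last_word(text):
    """Return the last word of text, keeping any listed dash inside it.

    Text without a match (for example a single word) is returned unchanged.
    """
    match = LAST_WORD.match(text)
    return match.group(1) if match else text


print(last_word("Camille Saint\u2010Sa\u00ebns"))  # surname with U+2010
print(last_word("Camille Saint-Sa\u00ebns"))       # surname with U+002D
```

In Mp3tag the same idea would mean adding each further dash character after the escaped \- inside the square brackets.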

Thank you, this one worked perfectly.
