Regular Expression to Extract Album


#1

Hope someone can help please

I am trying to clean up my album names for multi disc albums, by using the discnumber tag rather than a suffix on the album name

When I use Format value on album with
$regexp(%album%,(.)(disc|cd)(\D{0,3})(\d{1,2})(.),$1,1)

The Complete Singles Collection (CD 1) gets turned into
The Complete Singles Collection (

How can I keep whitespace and '(' out of $1, please? NB: some albums legitimately end in ')', and not all albums have disc numbers in them. The expression seems to match only the right albums and does not create false positives.

I have tried various character classes and escapings of '(', with optional existence and length, prior to disc|cd, but I'm not getting anywhere.

In English, I want to say: $1 should be as greedy as possible but should not take any non-alpha character or ')' (e.g. '(', \s, _) that might immediately (say, up to 3 characters) precede disc|cd. If there is no match then the album doesn't contain a disc number, so don't do anything.

thank you


#2

$regexp(%album%,'\s+\(?(disc|cd)\D{0,3}\d{1,2}\)?',,1)
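
To try the idea outside Mp3tag, here is a rough Python sketch. It assumes the intended pattern is \s+\(?(disc|cd)\D{0,3}\d{1,2}\)? and that the trailing 1 in $regexp means "ignore case". Python's re module is not the engine Mp3tag uses, and the function name is made up for illustration, so treat this as an approximation only.

```python
import re

# Assumed Python equivalent of the Mp3tag pattern above.
# re.IGNORECASE stands in for the trailing ",1" of $regexp (assumption).
DISC_SUFFIX = re.compile(r"\s+\(?(disc|cd)\D{0,3}\d{1,2}\)?", re.IGNORECASE)


def strip_disc_suffix(album: str) -> str:
    """Remove a ' (CD 1)' / ' Disc 2' style suffix; leave other names alone."""
    return DISC_SUFFIX.sub("", album)


print(strip_disc_suffix("The Complete Singles Collection (CD 1)"))
print(strip_disc_suffix("Greatest Hits (Remastered)"))
```

The empty replacement does the same job as the empty third argument of $regexp: the matched suffix is simply dropped, so no $1 capture of the leading text is needed.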


#3

thank you very much

So what does putting the expression in '' do... allow '(' to work?
Was it significant that you removed the capturing () placeholders / $N method, or was that just a choice?

Do you know how I can build out
\s+\(?

to say \s*\( but appearing in any order, any number of times (to capture all possible messy album name to disc suffix 'joins')? Would a character class work? [\s(]*


#4

'...' is used because some characters, like ( and [, have a special function in the Mp3tag scripting language. So if you write a longer expression, it's just safer and more readable to surround the whole term with '.
With ( ) it can work without ', but not always.

I removed the $n capturing because you just want to remove a certain part and it doesn't matter what comes before it.

[\s_(]* should work. You can also write [\s_\(]*
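
A quick way to check the character-class idea is a small Python sketch like the one below. It assumes the class simply replaces the \s+\(? prefix of the earlier pattern, and the function name is hypothetical. Python's re is not Mp3tag's regex engine, so results may differ in edge cases.

```python
import re

# Assumed variant: a run of whitespace, '_' or '(' in any order before disc|cd.
# re.IGNORECASE mirrors the ",1" argument of $regexp (assumption).
JOINED_DISC_SUFFIX = re.compile(r"[\s_(]*(disc|cd)\D{0,3}\d{1,2}\)?", re.IGNORECASE)


def strip_joined_disc_suffix(album: str) -> str:
    """Drop the disc suffix together with any messy 'join' characters before it."""
    return JOINED_DISC_SUFFIX.sub("", album)


print(strip_joined_disc_suffix("Best Of _(CD 2)"))
print(strip_joined_disc_suffix("Anthology( disc 03)"))
print(strip_joined_disc_suffix("Greatest Hits (Remastered)"))
```

One thing to watch: because * also allows an empty run, a name like "Abcd 12" would match on the "cd 12" part. Using + instead of * is one way to require at least one joining character.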


#5

thanks again