Help with Regex


#1

I'm trying to standardize my library.
I have multi disc albums with titles ... (disc x), [disc x], disc x, disc x of x and maybe one or two other forms.

What I'm trying to do is populate a discnumber tag, with the value of "x", instead of using (disc x) etc.

I've read the help files on scripting and regex, the FAQs, and the regex portion of this forum, and have even been through a couple of tutorials on the net, and I still haven't been able to figure out the matching expression.

Here's what I have.
[([]?disc.\d\d?

My understanding of this is:
[([]? - match an optional "(" or "[", followed by the string disc, followed by any character, followed by a 1 or 2 digit number.

This works fine in the filter box. The problem is, every time I try to match the closing ")" or "]" I get either zero results or it doesn't filter anything.

I've tried escaping the ) and ], and enclosing them in brackets and parentheses.

How do I match the closing ) or ] if it's there?


#2

My understanding of your problem:
You want to format the field DISCNUMBER according to the information given in the field ALBUM. You do not want to delete the disc number information from the ALBUM field afterwards.
What you wrote about the filter was just for testing regular expressions.

try this:
Action: Format Value
Field: DISCNUMBER
Formatstring: $regexp(%album%,.*disc (\d*).*,$1)

or maybe this fits even more cases:
$regexp(%album%,.*(disc|Disc|CD|cd) (\d*).*,$2)
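
For anyone who wants to try the second expression outside Mp3tag, here is a rough Python sketch. It assumes the pattern reads `.*(disc|Disc|CD|cd) (\d*).*`, i.e. with `*` quantifiers after the leading dot and after `\d`. Python's `re` module is only an approximation of Mp3tag's regex engine, and the function name `disc_number` is made up for illustration.

```python
import re

# Approximation of $regexp(%album%,.*(disc|Disc|CD|cd) (\d*).*,$2).
# Assumption: "*" quantifiers as described above; disc_number is a hypothetical name.
PATTERN = re.compile(r".*(disc|Disc|CD|cd) (\d*).*")

def disc_number(album: str) -> str:
    # re.sub replaces the matched text with group 2 (the digits);
    # it returns the input unchanged when the pattern does not match.
    return PATTERN.sub(r"\2", album)

print(disc_number("Greatest Hits (disc 2)"))  # 2
print(disc_number("Live [CD 1]"))             # 1
print(disc_number("Single Album"))            # Single Album
```

Because the pattern insists on a space between the keyword and the digits, a title like "Live CD1" is left unchanged in this sketch.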


#3

Actually I do want to delete the discnumber information from the ALBUM after I get DISCNUMBER populated.
I'm just taking this one step at a time.

Thank you pone. The second one seems to work for populating DISCNUMBER.

With a slight modification I get everything but the first " (" removed from ALBUM
Action: Format Value
Field: ALBUM
Formatstring: $regexp(%album%,(^.*)(disc|Disc|CD|cd) (\d*).*,$1)

It seems like everything I try in order to include that first ( results in a syntax error.


#4

Try this:
$regexp(%album%,(.*)( '('| '['|',' | - )(disc|Disc|CD|cd)( |)(\d*).*,$1)

And one step further, both operations in one action:
Action: Guess Values
Sourceformat: $regexp(%album%,(.*)( '('| '['|',' | - )(disc|Disc|CD|cd)( |)(\d*).*,$1XXXXX$5)
Formatstring: %album%XXXXX%discnumber%
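
To see what the Sourceformat step produces, here is a small Python sketch of the same five-group expression. It assumes the first group is `(.*)` and the fifth is `(\d*)`, and it writes Mp3tag's quoted characters `'('` and `'['` as `\(` and `\[`. The helper name `to_sourceformat` is hypothetical, and Python's `re` only approximates Mp3tag's engine.

```python
import re

# Groups: 1 = album name, 2 = separator, 3 = keyword, 4 = optional space, 5 = digits.
PATTERN = re.compile(r"(.*)( \(| \[|, | - )(disc|Disc|CD|cd)( |)(\d*).*")
MARKER = "XXXXX"  # the split marker used in the post above

def to_sourceformat(album: str) -> str:
    # Equivalent of the replacement "$1XXXXX$5": keep the name and the digits,
    # joined by the marker. Unmatched input is returned unchanged.
    return PATTERN.sub(rf"\1{MARKER}\5", album)

print(to_sourceformat("Abbey Road (disc 1)"))  # Abbey RoadXXXXX1
print(to_sourceformat("Best Of - CD2"))        # Best OfXXXXX2
print(to_sourceformat("Plain Album"))          # Plain Album
```

The Formatstring `%album%XXXXX%discnumber%` then only has to cut this intermediate string at the marker.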


#5

Thanks again pone. It works beautifully.

If I understand this code correctly, it says
Look in ALBUM for (any character, any number of times)(followed by either ( [ , or space - space)(followed by either disc, Disc, CD or cd)(I'm not sure what this means here)(followed by any number of digits), followed by any number of characters.
Put the result of the first set of parentheses into ALBUM.
Put the result of the fifth set of parentheses into DISCNUMBER.
I'm not sure about the fourth set of parentheses. It looks like it may be for either a space character or nothing at all.
I have no clue what all those X's are for.


#6

You understand the code correctly.
You are also right about the fourth set of parentheses. I included it for cases where the discs are named CD1 or disc2, without a space.

With Guess Values, the result of the first string is put into a "Sourceformat", not into a tag field. This is a kind of temporary field within the action. The sourceformat is then read by the second string, the formatstring. I've put XXXXX as a marker between $1 and $5 so that the formatstring knows where to split the sourceformat into the different tag fields. You can replace XXXXX with any character or set of characters you want; it's only important to choose something that is certainly not part of an album name, because otherwise the name would be split at that point.

If all your albums were simply named "albumname - cd x", we would not need the regular expression and would only write:
Sourceformat: %album%
Formatstring: %album% - cd %discnumber%

so " - cd " would be the field-split marker instead of "XXXXX".
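
The splitting idea can be sketched in Python like this. It is only a toy model of Guess Values, not how Mp3tag implements it: the function `guess_values` is hypothetical, and it assumes each `%field%` takes as little text as possible, so the split happens at the first occurrence of the marker.

```python
import re

def guess_values(source: str, formatstring: str) -> dict:
    # re.split with a capturing group alternates literal text and field names:
    # "%album%XXXXX%discnumber%" -> ['', 'album', 'XXXXX', 'discnumber', ''].
    parts = re.split(r"%(\w+)%", formatstring)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2:                       # odd positions are field names
            pattern += f"(?P<{part}>.*?)"   # non-greedy: stop at the next literal
        else:                               # even positions are literal markers
            pattern += re.escape(part)
    match = re.fullmatch(pattern, source)
    return match.groupdict() if match else {}

print(guess_values("Abbey RoadXXXXX1", "%album%XXXXX%discnumber%"))
print(guess_values("Greatest Hits - cd 2", "%album% - cd %discnumber%"))
```

Both calls yield an `album` and a `discnumber` entry; the literal text between the placeholders is what tells the sketch where one field ends and the next begins.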


#7

Is it necessary to have five X's, in this case?


#8

Sometimes it helps to make three crosses; in your case maybe three times three crosses would be recommended, but who knows the mysteries of Mp3tag for sure?

DD.20110131.0658.CET


#9

You have four slightly different cases, and maybe one or two other formats that you have not specified. So we have the following test cases, i.e. the text in a TITLE tag field.
aaa (disc 1) bbb
aaa [disc 2] bbb
aaa disc 3 bbb
aaa disc 14 of 14 bbb

Using the Filter [F3] ...
TITLE MATCHES "^.*?[ [(]disc (\d+)[]) ].*?$"
... these test cases can be shown in Mp3tag list view.

Using the scripting expression ...

$regexp(%TITLE%,'^.*?[ ([]disc (\d+)[]) ].*?$','$1',1)

... the digits right after the word "disc" (upper or lower case does not matter) will be extracted.
aaa (disc 1) bbb ==> 1
aaa [disc 2] bbb ==> 2
aaa disc 3 bbb ==> 3
aaa disc 14 of 14 bbb ==> 14
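
The four test cases can also be checked outside Mp3tag with a short Python sketch. It is an approximation only: `re.IGNORECASE` stands in for the trailing `,1` argument (assumed here to be what makes the match case-insensitive), the brackets inside the character classes are escaped for Python, and `extract_disc` is a made-up helper name.

```python
import re

# "[ (\[]" = one of space, "(" or "[" before "disc";
# "[\]) ]" = one of "]", ")" or space after the digits.
PATTERN = re.compile(r"^.*?[ (\[]disc (\d+)[\]) ].*?$", re.IGNORECASE)

def extract_disc(title):
    """Return the digits after "disc", or None when the pattern does not match."""
    match = PATTERN.match(title)
    return match.group(1) if match else None

for title in ["aaa (disc 1) bbb", "aaa [disc 2] bbb",
              "aaa disc 3 bbb", "aaa disc 14 of 14 bbb", "aaa disc 3"]:
    print(title, "==>", extract_disc(title))
```

The last title gives None in this sketch, because the pattern requires a `]`, `)` or space after the digits, so a title that ends directly with the number is not covered.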

Use an action "Format value" to get the disc number into the tag field DISCNUMBER.
Field: DISCNUMBER
Formatstring: $regexp(%TITLE%,'^.*?[ ([]disc (\d+)[]) ].*?$','$1',1)

DD.20110131.0738.CET

This RegExp could do the work too ...

$regexp(%TITLE%,'^\D*(\d+).*','$1')

... returns the first digits from the left side of the string.
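
A quick Python sketch of this shorter expression shows both what it does and where it can go wrong. The function name `first_digits` is hypothetical, and Python's `re` is used only as a stand-in for Mp3tag's engine.

```python
import re

def first_digits(title: str) -> str:
    # Skip leading non-digits, keep the first run of digits, drop everything after.
    # Input without any digit is returned unchanged.
    return re.sub(r"^\D*(\d+).*", r"\1", title)

print(first_digits("aaa disc 14 of 14 bbb"))  # 14
print(first_digits("1999 Hits (disc 2)"))     # 1999, not the disc number
```

So this variant is only safe when no other number appears before the disc number.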

DD.20110131.0754.CET

Or you can use a small action group ... not thoroughly tested.

Begin Action Group Test 2011#20110131.Snuffy.DISCNUMBER

Action #1
Actiontype 7: Import tag fields (guess values)
Source format: $lower(%TITLE%)
Guessing pattern: %DUMMY%disc÷%DISCNUMBER%÷%DUMMY%

Action #2
Actiontype 5: Format value
Field: DISCNUMBER
Formatstring: $num(%DISCNUMBER%,1)

Note: Replace each special ÷ character with one space character.
End Action Group Test 2011#20110131.Snuffy.DISCNUMBER (2 Actions)
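
The two actions can be modelled roughly in Python as follows. This is a sketch under assumptions, not a description of Mp3tag's internals: it assumes the guessing pattern assigns each field as little text as possible, and that `$num(...,1)` keeps the leading digits of the value and drops leading zeros and any trailing characters. The name `guess_disc` is made up.

```python
import re

# Model of "%DUMMY%disc %DISCNUMBER% %DUMMY%" (with the ÷ characters replaced by spaces).
GUESS = re.compile(r"(?P<dummy1>.*?)disc (?P<discnumber>.*?) (?P<dummy2>.*)")

def guess_disc(title):
    # Action #1: lower-case the title, then cut it around "disc " and the next space.
    match = GUESS.fullmatch(title.lower())
    if match is None:
        return None
    # Action #2: assumed $num behaviour, e.g. "1)" -> "1" and "03" -> "3".
    leading = re.match(r"\d+", match.group("discnumber"))
    return str(int(leading.group())) if leading else None

print(guess_disc("aaa (disc 1) bbb"))       # 1
print(guess_disc("aaa disc 14 of 14 bbb"))  # 14
```

In the first example the guessed value is "1)" before the numeric cleanup, which is why the second action is needed.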

DD.20110131.0816.CET


#10

I just wanted to be sure. If there is an album called "XXX rated", three X's would not be enough. I think XXXXX is probably not part of any title.
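
The collision can be illustrated with a few lines of Python. This assumes the split happens at the first occurrence of the marker, which may or may not be how Mp3tag behaves; `split_on_marker` is a hypothetical helper.

```python
def split_on_marker(source: str, marker: str):
    # Cut the intermediate string at the first occurrence of the marker.
    album, _, disc = source.partition(marker)
    return album, disc

# Marker "XXX" collides with the album name "XXX rated".
print(split_on_marker("XXX rated" + "XXX" + "2", "XXX"))      # ('', ' ratedXXX2')
# Marker "XXXXX" does not occur in the name, so the split is clean.
print(split_on_marker("XXX rated" + "XXXXX" + "2", "XXXXX"))  # ('XXX rated', '2')
```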